Iroh: A Unified Programming Language for the Entire Stack
Why Iroh?
Building production apps today tends to involve multiple languages, e.g. Go for the backend, SQL for the database, TypeScript for the frontend, CSS for styling, etc.
Worse, despite there being hundreds of languages, if you wanted high performance with strong memory safety and minimal runtime overhead, your options are limited, e.g. to Rust and Ada.
But what if we could have it all? A single, minimal language that:
- Was comparable to C, C++, Rust, and Zig in terms of performance.
- Provided the memory safety of Rust but without having to constantly work around the constraints of the borrow checker.
- Was designed for fast compilations and developer productivity like Pascal, Go, and Zig.
- Was as easy to learn as Python and TypeScript.
- Could be used across the entire stack: infrastructure, apps, data engineering, styling, and even smart contracts.
This is what we’re creating with Iroh. We will be using it to build both the Espra browser and Hyperchain. Developers will be able to use it to write both onchain scripts and user-facing apps on Espra.
Our only non-goal is running directly on hardware without an OS, e.g. on embedded systems or when writing kernels or drivers. We’ll leave that to Ada, C, C++, Forth, Rust, and Zig.
Execution Modes
Iroh is intended to support 6 primary execution modes:
- standard
  - The default mechanism that produces an executable binary for an OS and target architecture, e.g. macos-arm64.
  - This is pretty similar to what you would expect if you were to write some code in a language like Go.
  - This mode has full access to make whatever calls the underlying OS allows, e.g. calling into C functions, making system calls, etc.
  - The main function within the root package serves as the entrypoint for the program being run.
- onchain-script
  - A sandboxed execution mechanism with automatic metering of every operation.
  - This mode will have restricted data types, e.g. floating-point types won’t be available as they will necessitate complex circuits when producing zero-knowledge proofs.
  - This mode will also set certain defaults, e.g. treat all decimal-point literals as fixed128 values, check arithmetic operations for wrapping, default to an arena allocator, etc.
- onchain-ui-app
  - A sandboxed execution mechanism that can only call into the underlying system based on explicit functions that have been passed in by the “host”, e.g. to load and save data.
  - This is somewhat similar to how WASM modules are executed by web browsers.
  - The App defined within the root package serves as the entrypoint for the onchain app being run.
- declarative
  - A constrained mode that only allows the declarative aspects of the language.
  - This allows for Iroh to be used in settings like config files, whilst still retaining the richness of expressivity, type validation, IDE support, etc.
- gpu-compute
  - A constrained mode for executing code on a GPU or AI chip.
  - This mode will not allow for certain operations, e.g. dynamic allocation of memory, and will have restricted data types to match what the underlying chip supports.
- script
  - An interpreted mode where Iroh behaves like a dynamic “scripting” language.
  - This will be available when the iroh executable is run with no subcommands. This provides the user with a REPL combined with an Iroh-based shell interface.
  - The fast feedback mechanism of the REPL will allow for high developer productivity, rapid prototyping, as well as make it easier for people to learn and discover features.
  - Scripts can be run with no compilation by just passing the script name, e.g.
    iroh <script-name>.iroh
    Or if the script files start with a shebang, i.e.
    #!/usr/bin/env iroh
    They can be made executable with chmod +x and run directly, e.g.
    ./script-name.iroh
  - This mode will also be useful in embedded contexts, e.g. scripting for games, notebooks, etc.
While the execution mechanism and available functionality will be slightly different in each mode, they will all support the same core language, making life easier for developers everywhere.
Structured Code Editing
Iroh code is edited through a custom editor that gives the impression of editing text. For example, here’s what “hello world” looks like:
func main() {
    print("Hello world")
}
But despite looking like text, behind the scenes, Iroh’s editor updates various data structures about the code as it is being edited. In fact this is where Iroh gets its name from:
- Iroh stands for “intermediate representation of hypergraphs”.
Hypergraphs are just graphs where edges can connect any number of nodes, i.e. are not limited to just connecting two nodes. This gives us a powerful base data structure that can capture:
- Data flow dependencies.
- Control flow relationships.
- Type relationships.
- Package dependencies.
- Multi-dimensional relationships that typical structures like trees and normal graphs can’t express.
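To make the idea concrete, here is a minimal Python sketch of a hypergraph in which a single edge links any number of nodes. The class, method, and label names are hypothetical illustrations; they say nothing about how Iroh's editor actually stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Hypergraph:
    """A toy hypergraph: each edge is a labelled set of node names."""

    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, label, *members):
        # Unlike an ordinary graph edge, a hyperedge may join any number of nodes.
        self.nodes.update(members)
        self.edges.append((label, frozenset(members)))

    def edges_touching(self, node):
        # Return the labels of every hyperedge that includes the given node.
        return [label for label, members in self.edges if node in members]


g = Hypergraph()
# One data-flow edge relating three nodes at once: z depends on both x and y.
g.add_edge("data_flow", "x", "y", "z")
# A separate type-relationship edge over two of the same nodes.
g.add_edge("same_type", "x", "y")
print(g.edges_touching("x"))  # ['data_flow', 'same_type']
```

Because different kinds of relationship live in the same structure, one query can answer questions that would otherwise need several separate graphs.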
Perhaps most importantly, where the typical compiler loses semantic information at each step:
Source → Parse → AST → Semantic Analysis → IR → Optimize → Machine Code
Iroh maintains full semantic information throughout. This allows us to improve the developer experience in ways that are not possible with other languages:
- Instant semantic feedback as you type, going well beyond what IDEs can do today.
- Refactoring that understands intent, not just syntax.
- Intelligent error messages and debugging based on the full context.
It also makes Iroh itself a lot simpler to develop:
- Outside of single-line expressions, we don’t need to worry about any ambiguous or conflicting grammars in the language. We can use simpler syntax without worrying about parsing.
- There’s no need to build bespoke tooling like LSPs, code formatters, etc. Everything from how the code is presented and edited is just a transformation of the hypergraph.
This gives Iroh a massive competitive advantage:
- Compiling becomes superfast as most of the work that a typical compiler would do is already done at “edit time”. In essence, compilation is just:
  Hypergraph → Optimize → Machine Code
- All tools work from the same source of truth, providing consistency across the board.
- As new tools are just new hypergraph transformations, the system can be easily extended with richer and richer functionality.
- There’s a lot less maintenance work as it’s all effectively just one system instead of lots of separate tools.
Because a developer can only be editing one bit of code at any point in time, we can:
- Do deep analysis on just the specific area that is being edited. Since we’re not starting from scratch like a typical compiler, we have the full context to help guide the developer.
- Provide real-time inference, e.g. create sets of all the different errors that might be generated within a function.
- Automatically refactor, e.g. detect that a deeply nested component needs an extra field to be passed to it, and make sure that all of its parent callers pass it through.
- Provide custom editing views for different contexts, e.g. a live simulation interface for UI statecharts, a graph editor for dataflow systems, a picker for changing colour values, etc.
All in all, Iroh can provide a fundamentally better development experience, whilst still keeping it accessible and familiar with its default text-like representation.
Edit Calculus
Iroh’s editor builds on the fantastic work that Jonathan Edwards has been doing from Subtext onwards. At the heart of our editor, we have an edit calculus that:
- Codifies a broad set of operations on the underlying hypergraph that preserves the intent of changes, whilst providing efficient in-memory and on-disk representations of the data.
- Allows for “time-travelling” backwards and forwards across changes, whilst maintaining consistency with any changes made concurrently by others.
- Unlike CRDTs, natively ensures semantic validity and coherence when concurrent changes are merged.
- Unlike git’s text-based diffs, provides semantic diffs that preserve the meaning of changes made by developers.
This allows for the kind of collaboration on code, intelligent merging, enriched code reviews, and advanced debugging that’s not available in mainstream languages.
Numeric Data Types
Iroh implements the full range of numeric data types that one would expect:
- Integer types
- Floating-point types
- Complex number types
- Decimal types
- Uncertainty types
- Fraction types
All numeric values support the typical arithmetic operators:
- Addition:
x + y
- Subtraction:
x - y
- Multiplication:
x * y
- Division:
x / y
- Negation:
-x
- Modulus/Remainder:
x % y
- Exponentiation:
x ** y
- Percentage:
x%
Signed integer types are available in the usual bit-widths:
Type | Min Value | Max Value |
---|---|---|
int8 | -128 | 127 |
int16 | -32768 | 32767 |
int32 | -2147483648 | 2147483647 |
int64 | -9223372036854775808 | 9223372036854775807 |
int128 | -170141183460469231731687303715884105728 | 170141183460469231731687303715884105727 |
Likewise for the unsigned integer types:
Type | Min Value | Max Value |
---|---|---|
uint8 | 0 | 255 |
uint16 | 0 | 65535 |
uint32 | 0 | 4294967295 |
uint64 | 0 | 18446744073709551615 |
uint128 | 0 | 340282366920938463463374607431768211455 |
Some integer types have aliases like in Go:
- byte is an alias for uint8 to indicate that the data being processed represents bytes, e.g. in a byte slice: []byte.
- int is aliased to the signed integer type corresponding to the underlying architecture’s bit-width, e.g. int64 on 64-bit platforms, int32 on 32-bit platforms, etc.
- uint is aliased to the unsigned integer type corresponding to the underlying architecture’s bit-width, e.g. uint64 on 64-bit platforms, uint32 on 32-bit platforms, etc.
Like Zig, Iroh supports arbitrary-width integers for bit-widths between 1
and
65535
when the type name int
or uint
is followed by a bit-width, e.g.
// 7-bit signed integer
a = int7(20)
// 4096-bit unsigned integer
b = uint4096(10715086071862641821530)
If an arbitrary-precision integer is desired, then that is available via the
built-in bigint
type that automatically expands to fit the necessary
precision, e.g.
a = bigint(10715086071862641821530) ** 5000 // would need 365,911 bits
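The ranges and the bit count quoted above can be sanity-checked with Python's built-in arbitrary-precision integers. The helper names below are hypothetical, and the sketch assumes that Iroh's signed integers use two's complement, which the tables above are consistent with but which the text does not state outright.

```python
def signed_range(bits):
    """Return (min, max) for a two's-complement signed integer of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def unsigned_max(bits):
    """Return the largest value of an unsigned integer of the given width."""
    return (1 << bits) - 1


print(signed_range(7))   # (-64, 63), so a value such as 20 fits in 7 signed bits
print(signed_range(8))   # (-128, 127)
print(unsigned_max(16))  # 65535

# Bits needed to hold the bigint example from the text.
big = 10715086071862641821530 ** 5000
print(big.bit_length())  # 365911
```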
Integer literals can be represented in various formats:
// Decimal
24699848519483
// With underscores for legibility
24_699_848_519_483
// Hex
0x1676e1b26f3b
0X1676E1B26F3B
// Octal
0o547334154467473
// Binary
0b101100111011011100001101100100110111100111011
Integer types support the typical bit operators:
- Bitwise AND:
a & b
- Bitwise OR:
a | b
- Bitwise XOR:
a ^ b
- Bitwise NOT:
^a
- Bitwise AND NOT:
a &^ b
- Left shift:
a << b
- Right shift:
a >> b
Various methods exist on integer types to support bit manipulation, e.g.
- x.count_one_bits: count the number of bits with the value 1.
- x.leading_zero_bits: count the number of leading zeros.
- x.bit_length: the minimum number of bits needed to represent the value.
- x.reverse_bits: reverses the bits.
- x.rotate_bits_left: rotate the value left by n bits.
- x.to_big_endian: converts the value from little endian to big endian.
- x.to_little_endian: converts the value from big endian to little endian.
- x.trailing_zero_bits: count the number of trailing zeros.
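As a rough illustration of what several of these methods compute, the following Python sketch models them for an unsigned value of a fixed bit-width. The free-function form and the bits parameter are assumptions made for the sketch; the source does not give the Iroh method signatures.

```python
def _mask(bits):
    return (1 << bits) - 1


def count_one_bits(x, bits=8):
    return bin(x & _mask(bits)).count("1")


def leading_zero_bits(x, bits=8):
    # Width of the type minus the position of the highest set bit.
    return bits - (x & _mask(bits)).bit_length()


def trailing_zero_bits(x, bits=8):
    x &= _mask(bits)
    if x == 0:
        return bits
    # x & -x isolates the lowest set bit.
    return (x & -x).bit_length() - 1


def reverse_bits(x, bits=8):
    return int(format(x & _mask(bits), f"0{bits}b")[::-1], 2)


def rotate_bits_left(x, n, bits=8):
    n %= bits
    x &= _mask(bits)
    return ((x << n) | (x >> (bits - n))) & _mask(bits)


x = 0b00010110  # 22 as an 8-bit value
print(count_one_bits(x))       # 3
print(leading_zero_bits(x))    # 3
print(trailing_zero_bits(x))   # 1
print(reverse_bits(x))         # 104, i.e. 0b01101000
print(rotate_bits_left(x, 4))  # 97, i.e. 0b01100001
```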
Iroh provides the typical floating-point types:
Type | Implementation |
---|---|
float16 | IEEE-754-2008 binary16 |
float32 | IEEE-754-2008 binary32 |
float64 | IEEE-754-2008 binary64 |
float80 | IEEE-754-2008 80-bit extended precision |
float128 | IEEE-754-2008 binary128 |
As well as some additional floating-point types for use cases like machine learning:
Type | Implementation |
---|---|
float8_e4 | E4M3 8-bit floating-point format |
float8_e5 | E5M2 8-bit floating-point format |
bfloat16 | Brain floating-point format |
Non-finite floating-point values can be constructed using methods on the floating-point types, e.g.
float64.nan() // NaN
float64.inf() // Positive infinity
float64.neg_inf() // Negative infinity
Additional methods on the floating-point values support further manipulation:
- x.is_nan: checks if a floating-point value is a NaN.
- x.is_inf: checks if a floating-point value is infinity.
- x.is_finite: checks if a floating-point value is not a NaN or infinity.
- x.copy_sign: copies the sign of a given value.
Complex numbers can be represented using Iroh’s complex number types:
Type | Description |
---|---|
complex64 | Real and imaginary parts are float32 |
complex128 | Real and imaginary parts are float64 |
Complex number values can be constructed using the complex
constructor or
literals, i.e.
// Using the complex constructor
x = complex(1.2, 3.4)
// Using literals
x = 1.2 + 3.4i
Similarly, quaternion types are provided for use in domains like computer graphics and physics:
Type | Description |
---|---|
quaternion128 | Real and imaginary parts are float32 |
quaternion256 | Real and imaginary parts are float64 |
Quaternion values can be constructed using the quaternion constructor or literals, i.e.
// Using the quaternion constructor
x = quaternion(1.2, 3.4, 5.6, 7.8)
// Using literals
x = 1.2 + 3.4i + 5.6j + 7.8k
Iroh provides 2 fixed-point signed data types for dealing with things like monetary values:
Type | Max Value |
---|---|
fixed128 | 170141183460469231731.687303715884105727 |
fixed256 | 57896044618658097711785492504343953926634992332820282019728.792003956564819967 |
Unlike floating-point values, these fixed-point data types can represent decimal
values exactly, and support up to 18
decimal places of precision.
The smallest number that can be represented is:
0.000000000000000001
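The maximum values in the table equal the largest signed 128-bit and 256-bit integers divided by 10^18. The Python sketch below checks that arithmetic under the assumption, not confirmed by the text, that fixed-point values are stored as signed integers scaled by 10^18. The names SCALE and fixed_max are hypothetical.

```python
from decimal import Decimal, getcontext

getcontext().prec = 100  # enough digits to represent 256-bit values exactly

SCALE = 10 ** 18  # 18 decimal places


def fixed_max(bits):
    """Largest signed `bits`-wide integer, read as a value with 18 decimal places."""
    return Decimal((1 << (bits - 1)) - 1) / Decimal(SCALE)


print(fixed_max(128))  # 170141183460469231731.687303715884105727
print(fixed_max(256))

# With scaled integers, decimal fractions add exactly: 0.1 + 0.2 is 0.3.
tenth = SCALE // 10
print(tenth + 2 * tenth == 3 * tenth)  # True
# Binary floating-point cannot represent these fractions exactly.
print(0.1 + 0.2 == 0.3)  # False
```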
The fixed-point types are augmented with a third decimal type:
Type | Implementation |
---|---|
bigdecimal | Supports arbitrary-precision decimal calculations |
These are a lot slower than the fixed-point types as they are allocated on the heap. In return, they allow for unbounded scale, i.e. the number of digits after the decimal point, and unlimited range.
When even that’s not enough, and you need exact calculations, there’s a fraction type for representing rational numbers, i.e. a quotient a/b of arbitrary-precision integers, e.g.
x = fraction(1, 3)
y = fraction(2, 5)
z = x + y // results in 11/15 exactly
The numerator and denominator of the fraction can be directly accessed:
x = fraction(1, 3)
x.num // 1
x.den // 3
The fraction
value can be converted to a decimal value, e.g. using
bigdecimal
:
x = fraction(1, 3)
bigdecimal(x) // 0.333333333333333333
The exact scale
can also be specified, e.g.
x = fraction(1, 3)
bigdecimal(x, scale: 5) // 0.33333
Iroh also supports uncertainty values that are useful for things like simulations, financial risk analysis, scientific calculations, engineering tolerance analysis, etc. These types:
- Include a component that represents the level of uncertainty of its value.
- Propagate the uncertainty level through calculations.
Each of the floating-point and decimal types has an uncertainty variant, where the name of the variant is the underlying type’s name prefixed with u, e.g. ufloat64, ufixed128, etc.
Uncertainty values can be instantiated using the type constructor or using the
±
literal, e.g.
x = ufloat64(1.2, 0.2)
y = 1.2 ± 0.2
The uncertainty bounds are propagated across calculations, e.g.
x = 1.0 ± 0.1
y = 2.0 ± 0.05
z = x + y // 3.0 ± ~0.11
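The text does not say which propagation rule Iroh uses. The result of roughly 0.11 in the example above is consistent with treating the two uncertainties as independent and combining them in quadrature, so the Python sketch below assumes that rule. The Uncertain class is a hypothetical stand-in for a type such as ufloat64.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Uncertain:
    value: float
    uncertainty: float

    def __add__(self, other):
        # Assumption: independent errors, so uncertainties add in quadrature,
        # i.e. sqrt(a**2 + b**2).
        return Uncertain(
            self.value + other.value,
            math.hypot(self.uncertainty, other.uncertainty),
        )


x = Uncertain(1.0, 0.1)
y = Uncertain(2.0, 0.05)
z = x + y
print(z.value, round(z.uncertainty, 2))  # 3.0 0.11
```

Simply summing the two uncertainties would give 0.15 instead, which is why the choice of rule matters.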
The value and uncertainty of the resulting value can be accessed directly, e.g.
x = 1.2 ± 0.2
x.value // 1.2
x.uncertainty // 0.2
Numeric literals like 42
, 3.14
, and 1e6
are initially untyped and of
arbitrary precision in Iroh. They remain untyped when assigned to const
values, and are only typed when assigned to variables.
When numeric literals are not typed, by default:
- Values without decimal places, i.e. integer_literals, are inferred as int values.
- Values with decimal places, i.e. decimal_point_literals, are inferred as fixed128 values.
- Values with e or E exponents, i.e. exponential_literals, are inferred as float64 values.
For example:
// Untyped decimal-point value
const Pi = 3.14159265358979323846264338327950288419716939937510582097494459
// When explicit types aren't specified:
radius = 5.0 // inferred as a fixed128
area = Pi * radius * radius // Pi is treated as a fixed128
// When explicit types are specified:
radius = float32(5.0)
area = Pi * radius * radius // Pi is treated as a float32
The with
statement can be used to control how numeric literals are inferred in
specific lexical scopes, e.g.
with {
    .decimal_point_literals = .float32
    .integer_literals = .int128
} {
    x = 3.14 // x is a float32
    y = 1234 // y is an int128
}
Numeric types are automatically upcasted if it can be done safely, e.g.
x = int32(20)
y = int64(10)
z = x + y // x is automatically upcasted to an int64
Otherwise, variables will have to be cast explicitly, e.g.
x = int32(20)
y = int64(10)
z = x + int32(y) // y is explicitly downcasted to an int32
When integers are cast to a floating-point type, a type conversion is performed that tries to fit the closest representation of the integer within the float.
Conversely, when a floating-point value is cast to an integer type, its fractional part is discarded, and the result is the integer part of the floating-point value.
Except for the cases when a type won’t fit, e.g. when adding an int256
and a
fixed128
value, integers are automatically upcasted to fixed-point types when
needed.
Similarly, integers and fixed-point values are both considered safe for upcasting into bigdecimal values, e.g.
x = bigdecimal(1.2)
y = fixed128(2.5)
z = 3 * (x / y) // results in a bigdecimal value of 1.44
By default, the compiler will error if literals are assigned to types outside of
the range for an integer, fixed-point, or bigdecimal
type, e.g.
x = int8(1024) // ERROR!
Likewise, an error is generated at runtime if a value is downcast into a type that doesn’t fit, e.g.
x = int64(1024)
y = int8(x) // ERROR!
Integer types can take an optional policy
parameter to instead either truncate
the value to fit the target type’s bit width, recast the bits, or clamp it to
the type’s min or max value, e.g.
x = int64(4431)
y = int8(x, policy: .truncate) // results in 79
z = int8(x, policy: .clamp) // results in 127
// Recast a uint8 as an int8
x = uint8(200)
y = int8(x, policy: .recast) // results in -56
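The three policies can be modelled in a few lines of Python. This is a sketch under assumptions: cast_int and its parameters are hypothetical names, and .truncate and .recast are both modelled as keeping the low bits and reinterpreting them as two's complement, since the text does not spell out how the two differ beyond the examples.

```python
def cast_int(value, bits, signed=True, policy="checked"):
    """Convert `value` to a `bits`-wide integer using the given overflow policy."""
    lo = -(1 << (bits - 1)) if signed else 0
    hi = (1 << (bits - 1)) - 1 if signed else (1 << bits) - 1
    if lo <= value <= hi:
        return value
    if policy == "clamp":
        # Pin the value to the nearest end of the target range.
        return max(lo, min(hi, value))
    if policy in ("truncate", "recast"):
        # Keep only the low `bits` bits, then reinterpret as two's complement.
        wrapped = value & ((1 << bits) - 1)
        if signed and wrapped > hi:
            wrapped -= 1 << bits
        return wrapped
    raise OverflowError(f"{value} does not fit in {bits} bits")


print(cast_int(4431, 8, policy="truncate"))  # 79
print(cast_int(4431, 8, policy="clamp"))     # 127
print(cast_int(200, 8, policy="recast"))     # -56
```

With the default policy, cast_int(1024, 8) raises an error, mirroring the runtime error described for the unparameterized cast.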
Non-integer numeric types do not support a custom policy:
- Fixed-point and bigdecimal values will always generate an error when a value doesn’t fit within its range.
- Floating-point values will silently overflow into either +inf or -inf, or if a value is smaller than the smallest representable subnormal, it will become either +0 or -0.
Arithmetic operations on integer and fixed-point types are automatically checked, i.e. will raise an error on either overflow or underflow, e.g.
x = uint8(160)
y = uint8(160)
z = x + y // ERROR!
This behaviour can be changed by using the with
statement to set the
.integer_arithmetic_policy
to either .wrapping
, .saturating
, or
.checked
, e.g.
with {
    .integer_arithmetic_policy = .wrapping
} {
    x = uint8(160)
    y = uint8(160)
    z = x + y // results in 64
}
The integer types also provide methods corresponding to each of the policy variants, for when you don’t want to change the setting for the whole scope, e.g.
x = uint8(160)
y = uint8(160)
z = x.wrapping_add(y) // results in 64
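For comparison, here is how the three arithmetic policies behave for an 8-bit unsigned addition, sketched in Python. The function name and the string policy values are hypothetical; only the policy names themselves come from the text.

```python
UINT8_MAX = 255


def add_uint8(x, y, policy="checked"):
    """Add two uint8 values under the given overflow policy."""
    total = x + y
    if total <= UINT8_MAX:
        return total
    if policy == "wrapping":
        # Discard the carry, i.e. reduce modulo 2**8.
        return total % (UINT8_MAX + 1)
    if policy == "saturating":
        # Stick at the largest representable value.
        return UINT8_MAX
    raise OverflowError(f"{x} + {y} overflows uint8")


print(add_uint8(160, 160, policy="wrapping"))    # 64
print(add_uint8(160, 160, policy="saturating"))  # 255
```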
The errors mentioned above, along with the error caused by dividing by zero, can
be caught using the try
keyword, e.g.
x = 1024
y = try int8(x)
Percentage values can be constructed using the %
suffix, e.g.
total = 146.00
vat_rate = 20%
vat = total * vat_rate
vat == 29.20 // true
Numeric types can also be constructed from string values, e.g.
x = int64("910365")
The string value can be of any format that’s valid for a literal of that type, e.g.
x = int32("1234")
y = fixed128("28.50")
z = uint8("0xff") // Automatic base inferred from the 0x prefix
An optional base
parameter can be specified when parsing strings to integer
types, e.g.
x = int64("deadbeef", base: .hex)
y = uint8("10101100", base: .binary)
Numbers can be rounded using the built-in round
function. By default it will
round to the closest integer value using .half_even
rounding, e.g.
x = 13.5
round(x) // 14
An alternative rounding mode can be specified if desired, e.g.
x = 13.5
round(x, .down) // 13
The following rounding modes are natively supported:
enum {
    half_even, // Round to nearest, .5 goes towards nearest even integer
    half_up, // Round to nearest, .5 goes away from zero
    half_down, // Round to nearest, .5 goes towards zero
    up, // Round away from zero
    down, // Round towards zero
    ceiling, // Round towards positive infinity
    floor, // Round towards negative infinity
    truncate, // Remove fractional part
}
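The differences between these modes are easiest to see on a tie such as 13.5 and its negative. The Python sketch below maps each mode name onto the corresponding constant in Python's decimal module; the mapping and the round_to_int helper are assumptions made for illustration, based on the comments in the enum above.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_DOWN,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
    ROUND_UP,
)

MODES = {
    "half_even": ROUND_HALF_EVEN,
    "half_up": ROUND_HALF_UP,
    "half_down": ROUND_HALF_DOWN,
    "up": ROUND_UP,
    "down": ROUND_DOWN,
    "ceiling": ROUND_CEILING,
    "floor": ROUND_FLOOR,
    "truncate": ROUND_DOWN,  # dropping the fraction equals rounding towards zero
}


def round_to_int(text, mode="half_even"):
    """Round a decimal string to an integer using the named mode."""
    return int(Decimal(text).quantize(Decimal("1"), rounding=MODES[mode]))


for mode in MODES:
    print(mode, round_to_int("13.5", mode), round_to_int("-13.5", mode))

print(round_to_int("12.5"))  # 12, because the tie goes to the even neighbour
```

Only ceiling and floor treat positive and negative inputs asymmetrically; the other modes mirror around zero.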
The number of decimal places can be controlled by an optional scale
parameter, e.g.
pi = 3.1415926
round(pi, scale: 4) // 3.1416
A negative scale makes the rounding occur to the left of the decimal point. This is useful for rounding to the nearest ten, hundred, thousand, etc.
x = 12345.67
round(x, scale: -2) // 12300
The optional significant_figures
parameter can round to a specific number of
significant figures, e.g.
x = 123.456
round(x, significant_figures: 3) // 123
y = 0.001234
round(y, significant_figures: 2) // 0.0012
The round
function works on all numeric types. Note that the rounding of
floating-point values might yield surprising results as most decimal fractions
can’t be represented exactly in floats.
Certain arithmetic operations on the decimal types are automatically rounded. The compiler will avoid re-ordering these so that outputs are deterministic.
By default, rounding for both fixed-point types and bigdecimal
will be to 18
decimal places and using the .half_even
rounding mode. This can be controlled
using the with
statement, e.g.
with {
    .decimal_point_literals = .fixed128
    .integer_literals = .fixed128
    .decimal_round = .floor
    .decimal_scale = 4
} {
    x = 1/3 // 0.3333
}
Converting numeric values into strings can be done by just casting them into a
string
type, e.g.
x = 1234
string(x) // "1234"
This can take an optional format
parameter to control how the number is formatted, e.g.
x = 1234
string(x, format: .decimal) // "1234", the default
string(x, format: .hex) // "4d2"
string(x, format: .hex_upper) // "4D2"
string(x, format: .hex_prefixed) // "0x4d2"
string(x, format: .hex_upper_prefixed) // "0X4D2"
string(x, format: .octal) // "2322"
string(x, format: .octal_prefixed) // "0o2322"
string(x, format: .binary) // "10011010010"
string(x, format: .binary_prefixed) // "0b10011010010"
The optional scale
parameter will pad the output with trailing zeros to match
the desired number of decimal places, e.g.
x = 12.3
string(x, scale: 2) // "12.30"
If something other than the default .half_even rounding, i.e. banker’s rounding, is desired, then the optional round parameter can be used:
x = 12.395
string(x, round: .floor, scale: 2) // "12.39"
The optional thousands
parameter can be set to segment the number into
multiples of thousands, e.g.
x = 1234567
string(x, thousands: true) // "1,234,567"
For formatting numbers as they’re expected in different locales, the locale
parameter can be set, e.g.
x = 1234567.89
string(x, locale: .de) // "1.234.567,89"
A with
statement can be used to apply the locale to a lexical scope, e.g.
with {
    .locale = .de
} {
    // all string formatting in this scope will use the specified locale
}
The specific separators for decimal and thousands can also be controlled
explicitly. The setting of thousands_separator
implicitly sets thousands
to
true
.
x = 1234567.89
string(x, decimal_separator: ",", thousands_separator: ".") // "1.234.567,89"
For more complete numeric support, the standard library also provides additional packages, e.g. the math package defines constants like Pi, implements trig functions, etc.
Unit Values
Units can be defined using the built-in unit
type and live within a custom
<unit>
namespace, e.g.
<s> = unit(name: "second", plural: "seconds")
<km> = unit(name: "kilometre", plural: "kilometres")
Numeric values can be instantiated with a specific <unit>
, e.g.
distance = 20<km>
timeout = 30<s>
Unit definitions are evaluated at compile-time, and can be related to each
other via the @relate
function, e.g.
<s> = unit(name: "second", plural: "seconds")
<min> = unit(name: "minute", plural: "minutes")
<hour> = unit(name: "hour", plural: "hours")
@relate(<min>, 60<s>)
@relate(<hour>, 60<min>)
If the optional si_unit
is set to true
during unit definition, then variants
using SI prefixes will be automatically created, e.g.
<s> = unit(name: "second", plural: "seconds", si_unit: true)
// Units like ns, us, ms, ks, Ms, etc. are automatically created, e.g.
1<s> == 1000<ms> // true
Non-linear unit relationships can be defined too, e.g.
<C> = unit(name: "°C")
<F> = unit(name: "°F")
@relate(<C>, ((<F> - 32) * 5) / 9)
// The opposite is automatically inferred, i.e.
<F> == ((<C> * 9) / 5) + 32
Cyclical units with a wrap_at
value automatically wrap-around on calculations,
e.g.
<degrees> = unit(name: "°", wrap_at: 360)
difference = 20<degrees> - 350<degrees>
difference == 30<degrees> // true
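The wrap-around in this example matches ordinary modular arithmetic. A minimal Python sketch, assuming results are reduced into the half-open interval from 0 up to wrap_at (the function name is hypothetical):

```python
def wrap(value, wrap_at=360):
    # Python's % always returns a result with the sign of the divisor,
    # so negative differences land back in the 0..wrap_at range.
    return value % wrap_at


print(wrap(20 - 350))  # 30
print(wrap(350 + 20))  # 10
```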
Logarithmic units can also be defined, e.g.
<dB> = unit(name: "decibel", plural: "decibels", logarithmic: true, base: 10)
20<dB> + 30<dB> == 30.4<dB> // true
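The 30.4 figure follows from adding the underlying linear quantities and not the logarithms. The Python sketch below reproduces it, assuming decibels of power, i.e. a value of 10 times the base-10 logarithm of a ratio; the text only specifies base: 10, so the factor of 10 is an assumption.

```python
import math


def add_decibels(a_db, b_db):
    """Add two power levels given in decibels."""
    # Convert each level to a linear ratio, add, then convert back.
    linear = 10 ** (a_db / 10) + 10 ** (b_db / 10)
    return 10 * math.log10(linear)


print(round(add_decibels(20, 30), 1))  # 30.4
```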
Values with units are of the type quantity
and can also be programmatically
defined. When a function expects a parameter of type unit
, the <>
can be
elided, e.g.
distance = quantity(20, km)
Computations with quantities propagate their units, e.g.
speed = 20<km> / 40<min>
speed == 0.5<km/min> // true
Quantities can be normalized to convertible units, e.g.
speed = 0.5<km/min>
speed.to(km/hour) == 30<km/hour>
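One plausible way to implement such a conversion is to express every related unit in terms of a common base and rescale by the ratio. The Python sketch below does this for the time component of a rate. The SECONDS_PER table and convert_rate function are hypothetical, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

# Hypothetical relation table: how many seconds each time unit contains.
SECONDS_PER = {"s": 1, "min": 60, "hour": 3600}


def convert_rate(value, from_time_unit, to_time_unit):
    """Convert a 'per from_time_unit' rate into a 'per to_time_unit' rate."""
    return Fraction(value) * SECONDS_PER[to_time_unit] / SECONDS_PER[from_time_unit]


speed_km_per_min = Fraction(20, 40)  # 20 km covered in 40 min
print(speed_km_per_min)                               # 1/2
print(convert_rate(speed_km_per_min, "min", "hour"))  # 30
```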
The type system automatically prevents illegal calculations, e.g.
10<USD> + 20<min> // ERROR!
While units are automatically calculated on multiplication and division, e.g.
force = mass * acceleration // kg⋅m/s² (newtons)
energy = force * distance // kg⋅m²/s² (joules)
power = energy / time // kg⋅m²/s³ (watts)
Quantities default to using fixed128
values, but this can be customized by
using a different type during the value construction, e.g.
speed = bigdecimal(55.312)<km/s>
measurement = (1.5 ± 0.1)<m>
Quantities can be parsed from string values where a numeric value is suffixed with the unit, e.g.
timeout = quantity("30s")
timeout == 30<s> // true
By default all units that are available in the scope are supported. This can be
constrained by specifying the optional limit_to
parameter, e.g.
block_size_limit = quantity("100MB", limit_to: [MB, GB])
block_size_limit == 0.1<GB> // true
Quantities can also be cast to strings, e.g.
timeout = 30<s>
string(timeout) == "30s"
Localized long form names for units can be used by setting long_form
to
true
. These default to the names given during unit definition, e.g.
string(1<s>, long_form: true) == "1 second" // true
string(30<s>, long_form: true) == "30 seconds" // true
When cast to a string, the most appropriate unit from a list can be
automatically selected by specifying closest_fit
. This will find the largest
unit that gives a positive integer quantity, e.g.
time_taken = time.since(start) // 251<s>
string(time_taken, closest_fit: [s, min, hour]) == "4 minutes" // true
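The selection rule can be sketched in Python as follows. The tables and the closest_fit function are hypothetical, and the sketch assumes the quantity is truncated to a whole number of the chosen unit; the text does not say whether Iroh truncates or rounds.

```python
SECONDS_PER = {"s": 1, "min": 60, "hour": 3600}
NAMES = {
    "s": ("second", "seconds"),
    "min": ("minute", "minutes"),
    "hour": ("hour", "hours"),
}


def closest_fit(seconds, units):
    """Format a duration using the largest unit that yields at least 1."""
    ordered = sorted(units, key=SECONDS_PER.get, reverse=True)
    for unit in ordered:
        count = seconds // SECONDS_PER[unit]
        if count >= 1:
            singular, plural = NAMES[unit]
            return f"{count} {singular if count == 1 else plural}"
    # Nothing fits, e.g. a zero duration: fall back to the smallest unit.
    return f"0 {NAMES[ordered[-1]][1]}"


print(closest_fit(45, ["s", "min", "hour"]))    # 45 seconds
print(closest_fit(150, ["s", "min", "hour"]))   # 2 minutes
print(closest_fit(7300, ["s", "min", "hour"]))  # 2 hours
```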
The humanize
parameter automatically selects up to two of the largest units
that result in whole numbers, e.g.
string(
    time.since(post.updated_time),
    humanize: true,
    limit_units_to: [s, min, hour, day, month, year]
)
// Outputs look something like:
// "2 minutes"
// "3 months and 10 days"
Types of a specific quantity can be referred to explicitly as quantity[unit]
,
e.g.
distance = 10<km>
type(distance) == quantity[km] // true
Some quantity types are aliased for convenience, e.g.
duration = quantity[s]
When parsing from a string to a quantity of a specific unit, it will be normalized from relatable units, e.g.
timeout = 30<s>
timeout = quantity("2min")
timeout == 120<s> // true
Custom asset
and currency
units can be defined at runtime with a symbol
and name
, e.g.
USDC = currency(symbol: "USDC", name: "USD Coin")
EURC = currency(symbol: "EURC", name: "EUR Coin")
Computations with currency units can be converted using live exchange rates at runtime, e.g.
fx_rate = 1.16<USDC/EURC>
total = 1000<EURC>
total.to(USDC, at: fx_rate) == 1160<USDC> // true
This allows type safety to be maintained, e.g. a GBP
value can’t be
accidentally added to a USD
value, while supporting explicit conversion, e.g.
gbp_subtotal = 200<GBP>
usd_subtotal = 100<USD>
total = gbp_subtotal + usd_subtotal // ERROR! Can't mix currencies
// This would work:
total = gbp_subtotal + usd_subtotal.to(GBP, at: fx_rate)
Range Values
Iroh provides a native range
type for constructing a sequence of values
between a given start
and end
integer values, e.g.
x = range(start: 5, end: 9)
len(x) == 5 // true
// Prints: 5, 6, 7, 8, 9
for i in x {
    print(i)
}
A next
value can also be provided to deduce the “steps” to take between the
start and end, e.g.
x = range(start: 1, next: 3, end: 10)
// Prints: 1, 3, 5, 7, 9
for i in x {
    print(i)
}
If next
is not specified, it defaults to start + 1
if the start
value is
less than or equal to end
, otherwise it defaults to start - 1
, e.g.
x = range(start: 5, end: 1)
// Prints: 5, 4, 3, 2, 1
for i in x {
    print(i)
}
Likewise, if start
is not specified, it defaults to 0
, e.g.
x = range(end: 5)
// Prints: 0, 1, 2, 3, 4, 5
for i in x {
    print(i)
}
Note that, unlike range
in Python, our ranges are inclusive of the end
value. We believe this is much more intuitive — especially when it comes to
using ranges to slice values.
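The rules described so far (an inclusive end, a step deduced from next, and the defaults for start and next) can be modelled with a short Python function. This is a sketch of the described semantics only; inclusive_range and next_value are hypothetical names.

```python
def inclusive_range(end, start=0, next_value=None):
    """Return the values from start to end inclusive, stepping by next_value - start."""
    if next_value is None:
        # Default step is +1 when counting up and -1 when counting down.
        next_value = start + 1 if start <= end else start - 1
    step = next_value - start
    if step == 0:
        raise ValueError("next must differ from start")
    values = []
    current = start
    while (step > 0 and current <= end) or (step < 0 and current >= end):
        values.append(current)
        current += step
    return values


print(inclusive_range(9, start=5))                 # [5, 6, 7, 8, 9]
print(inclusive_range(10, start=1, next_value=3))  # [1, 3, 5, 7, 9]
print(inclusive_range(1, start=5))                 # [5, 4, 3, 2, 1]
print(inclusive_range(5))                          # [0, 1, 2, 3, 4, 5]
```

Note that the end value is only included when the step lands on it exactly, as in the second call, where 10 is skipped.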
Since ranges are frequently used, the shorthand start..end
syntax from Perl is
available, e.g.
x = 5..10
x == range(start: 5, end: 10) // true
This is particularly useful in for
loops, e.g.
// Prints: 5, 6, 7, 8, 9, 10
for i in 5..10 {
    print(i)
}
Similar to Haskell, the next
value can also be specified in the shorthand as
start,next..end
, e.g.
// Prints: 5, 7, 9
for i in 5,7..10 {
    print(i)
}
While range expressions do not support full sub-expressions, variable identifiers can be used in place of integer literals, e.g.
pos = 0
next = 2
finish = 6
// Prints: 0, 2, 4, 6
for i in pos,next..finish {
    print(i)
}
When the start value is elided in range expressions, it defaults to 0
, e.g.
finish = 6
// Prints: 0, 1, 2, 3, 4, 5, 6
for i in ..finish {
    print(i)
}
Iroh does not provide syntax for ranges that are exclusive of its end value, as
having two separate syntaxes tends to confuse developers, e.g. ..
and ...
in
Ruby, ..
and ..=
in Rust, etc.
Array Data Types
Iroh supports arrays, i.e. ordered collections of elements of the same type with a length that is determined at compile-time.
// A 5-item array of ints:
x = [5]int{1, 2, 3, 4, 5}
// A 3-item array of strings:
y = [3]string{"Tav", "Alice", "Zeno"}
Array types are of the form [N]Type, where N specifies the length and Type specifies the type of each element. All unspecified array elements are zero-initialized, e.g.
x = [3]int{}
x[0] == 0 and x[1] == 0 and x[2] == 0 // true
Elements at specific indexes can also be explicitly specified for sparse initialization, e.g.
x = [3]int{2: 42}
x == {0, 0, 42} // true
The ;
delimiter can be used within the length specifier to initialize with a
custom value, e.g.
x = [3; 20]int{2: 42}
x == {20, 20, 42} // true
For more complex defaults, a fill function can also be specified. This function is passed the index positions for all dimensions that can be optionally used, e.g.
x = [3]int{} << { $0 * 2 }
x == {0, 2, 4} // true
Using a custom fill function together with either a default value or with arrays that have elements initialized at specific indexes will result in an error at edit time.
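The initialization options above can be summarized with a small Python model for integer arrays. The make_array function and its parameter names are hypothetical, and the edit-time error is approximated here by a runtime exception.

```python
def make_array(length, default=None, at=None, fill=None):
    """Build a list that mimics the array initialization forms described above."""
    if fill is not None and (default is not None or at):
        # Mirrors the rule that a fill function cannot be combined with
        # a default value or with elements set at specific indexes.
        raise ValueError("fill cannot be combined with a default or indexed elements")
    if fill is not None:
        return [fill(index) for index in range(length)]
    values = [0 if default is None else default] * length
    for index, value in (at or {}).items():
        values[index] = value
    return values


print(make_array(3))                           # [0, 0, 0]
print(make_array(3, at={2: 42}))               # [0, 0, 42]
print(make_array(3, default=20, at={2: 42}))   # [20, 20, 42]
print(make_array(3, fill=lambda i: i * 2))     # [0, 2, 4]
```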
The length of an array can either be an integer literal or an expression that compile-time evaluates to an integer, e.g.
const n = 5
x = [n + 1]int{}
len(x) == 6 // true
The ...
syntax can be used to let the compiler automatically evaluate the
length of the array from the number of elements, e.g.
x = [...]int{1, 2, 3}
len(x) == 3 // true
The element type can also be elided when it can be automatically inferred, e.g.
x = [3]{"Zeno", "Reia", "Zaia"} // type is [3]string
y = [...]{"Zeno", "Reia", "Zaia"} // type is [3]string
Array elements are accessed via [index]
, which is 0-indexed, e.g.
x = [3]int{1, 2, 3}
x[1] == 2 // true
Indexed access is always bounds checked for safety. Out of bounds access will generate a compile-time error for index values that are known at compile-time, and a runtime error otherwise.
Elements of an array can be iterated using for
loops, e.g.
x = [3]int{1, 2, 3}
for elem in x {
print(elem)
}
If you want the index as well, destructure the iterator into 2 values, i.e.
x = [3]int{1, 2, 3}
for idx, elem in x {
print("Element at index ${idx} is: ${elem}")
}
Arrays can be compared for equality by using the ==
operator, which compares
each element of the array for equality, e.g.
x = [3]int{1, 2, 3}
if x == {1, 2, 3} {
// do something
}
Note that when the compiler can infer the type of something, e.g. on one side of a comparison where the type of the other side is already known, the type specification can be elided as above.
Array instances also have a bunch of utility methods, e.g.
-
x.contains
— checks if the array has a specific element. -
x.index_of
— returns the first matching index of a specific element if it exists. -
x.reversed
— returns a reversed copy of the array. -
x.sort
— sorts the array as long as the elements are sortable.
Depending on the size of the array and how it’s used, the compiler will automatically allocate the array on the stack, on the heap, or even in registers.
By default, arrays are passed by value. If a function needs to mutate the array so that the changes are visible to the caller, it can specify the parameter type to be a pointer to an array, e.g.
func change_first_value(x *[3]int) {
x[0] = 21 // indexed access is automatically dereferenced
}
x = [3]int{1, 2, 3}
change_first_value(&x)
print(x) // [3]int{21, 2, 3}
If an array is defined as a const
, it can’t be mutated, e.g.
const x = [3]int{1, 2, 3}
x[0] = 5 // ERROR!
Arrays with elements of the same type can be cast between different sizes, e.g.
x = [3]int{1, 2, 3}
y = [5]int(x) // when upsizing, missing values are zero-initialized
z = [2]int(x) // when downsizing, excess values are truncated
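The resizing behaviour of such casts can be summarised in a few lines. The following Python sketch only models the described truncate-or-pad rule; `cast_length` is a hypothetical helper name.

```python
def cast_length(values, new_length, zero=0):
    """Truncate when downsizing; pad with the zero value when upsizing."""
    kept = list(values[:new_length])
    padding = [zero] * max(0, new_length - len(values))
    return kept + padding


x = [1, 2, 3]
upsized = cast_length(x, 5)
downsized = cast_length(x, 2)
```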
Multi-dimensional arrays can also be specified by stacking the array types, e.g.
x = [2][3]int{
{1, 2, 3},
{4, 5, 6}
}
This can also use the slightly clearer MxN
syntax, e.g.
x = [2x3]int{
{1, 2, 3},
{4, 5, 6}
}
This can be extended to whatever depth is needed, e.g.
x = [2x3x4]int{}
y = [480x640x3]byte{}
Multi-dimensional arrays default to row-major layouts so as to be CPU cache-friendly. For use cases where a column-major layout is needed, this can be explicitly specified, e.g.
x = [2x3@col]int{}
This can also be set for an entire lexical scope using the with
statement, e.g.
with {
.array_layout = .column_major
} {
x = [2x3]int{}
}
Such array layouts can be useful in domains like linear algebra calculations. The layout can also be transposed dynamically to a new array via casting, e.g.
x = [2x3]int{}
y = @col(x)
z = @row(y) // same layout as x
Or by using @transpose
when you just want the opposite layout, e.g.
x = [2x3]int{}
y = @transpose(x)
z = @transpose(y) // same layout as x
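To make the difference between the two layouts concrete, the sketch below computes where each element of a 2x3 matrix lands in flat memory. It is a Python illustration of the general row-major and column-major conventions, not of Iroh's internals; `flat_offset` and `flatten` are hypothetical names.

```python
def flat_offset(row, col, rows, cols, layout="row"):
    """Return the position of element (row, col) in a flat buffer."""
    if layout == "row":
        return row * cols + col  # rows are contiguous
    return col * rows + row      # columns are contiguous


def flatten(matrix, layout="row"):
    rows, cols = len(matrix), len(matrix[0])
    flat = [None] * (rows * cols)
    for r in range(rows):
        for c in range(cols):
            flat[flat_offset(r, c, rows, cols, layout)] = matrix[r][c]
    return flat


m = [[1, 2, 3], [4, 5, 6]]
row_major = flatten(m, "row")
col_major = flatten(m, "col")
```

Iterating along the contiguous dimension touches neighbouring memory, which is why the choice of layout matters for cache behaviour.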
Slice Data Types
Slices, like arrays, are used for handling ordered elements. Unlike arrays, which have a size that is fixed at compile-time, slices can be dynamically resized during runtime.
Slice types are of the form []Type
where Type
specifies the type of each
element. For example, a slice of ints can be initialized with:
x = []int{1, 2, 3}
Since slices are used often, a shorthand is available when the element type can be inferred, e.g.
x = [1, 2, 3]
Internally, slices point to a resizable array, keep track of their current length, i.e. the current number of elements, and specified capacity, i.e. allocated space:
slice = struct {
_data [*]array // pointer to an array
_length int
_capacity int
}
The built-in len
and cap
functions provide the current length and capacity
of a slice, e.g.
x = [1, 2, 3]
len(x) == 3 // true
cap(x) == 3 // true
Slices are dynamically resized as needed, e.g. if you append to a slice that is already at capacity:
x = [1, 2, 3]
x.append(4)
len(x) == 4 // true
cap(x) == 4 // true
When a slice’s capacity needs to grow, it is increased to the closest power of 2 that will fit the additional elements, e.g.
x = [1, 2, 3, 4]
x.append(5)
len(x) == 5 // true
cap(x) == 8 // true
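The growth rule in the two examples above can be expressed as "the smallest power of 2 that fits the required length". A minimal Python sketch of that rule, assuming nothing else about Iroh's allocator (`grown_capacity` is a hypothetical name):

```python
def grown_capacity(required):
    """Smallest power of 2 that is at least `required`."""
    capacity = 1
    while capacity < required:
        capacity *= 2
    return capacity


after_fourth = grown_capacity(4)  # appending to a full 3-element slice
after_fifth = grown_capacity(5)   # appending to a full 4-element slice
```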
As slices are just views into arrays, they can be formed by “slicing” arrays with a range, e.g.
x = [5]int{1, 2, 3, 4, 5}
y = x[0..2]
y == [1, 2, 3] // true
Slices can also be sliced to form new slices, e.g.
x = [1, 2, 3, 4, 5]
y = x[0..2]
y == [1, 2, 3] // true
When a slice needs to grow beyond the underlying capacity, a new array is allocated and existing values are copied over. This allows for multiple independent views, e.g.
arr = [4]int{1, 2, 3, 4}
x = arr[2..3] // [3, 4]
y = arr[..] // [1, 2, 3, 4]
// Changes to the underlying array are reflected in other views, e.g.
x[0] = 10
x == [10, 4] // true
y == [1, 2, 10, 4] // true
// But if a slice is grown, it no longer points to the same array, e.g.
x.append(5)
// So changes are not reflected in other views, e.g.
x[0] = 11
x == [11, 4, 5] // true
y == [1, 2, 10, 4] // true
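The sharing and detaching behaviour above can be modelled with a small view type over a shared backing list. This is a Python sketch for illustration only; the `SliceView` class, its fields, and the power-of-2 growth inside `append` are assumptions, and bounds checking is omitted for brevity.

```python
class SliceView:
    """A window onto a shared backing list (hypothetical model)."""

    def __init__(self, backing, start, length, capacity):
        self.backing = backing
        self.start = start
        self.length = length
        self.capacity = capacity

    def __getitem__(self, index):
        return self.backing[self.start + index]

    def __setitem__(self, index, value):
        self.backing[self.start + index] = value

    def to_list(self):
        return self.backing[self.start:self.start + self.length]

    def append(self, value):
        if self.length == self.capacity:
            # Out of room: copy into a fresh backing list, detaching this view.
            new_capacity = 1
            while new_capacity < self.length + 1:
                new_capacity *= 2
            self.backing = self.to_list() + [None] * (new_capacity - self.length)
            self.start = 0
            self.capacity = new_capacity
        self.backing[self.start + self.length] = value
        self.length += 1


arr = [1, 2, 3, 4]
x = SliceView(arr, start=2, length=2, capacity=2)
y = SliceView(arr, start=0, length=4, capacity=4)

x[0] = 10               # written through to the shared backing list
shared = y.to_list()
x.append(5)             # x is full, so it moves to its own backing list
x[0] = 11               # no longer visible through y
```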
Individual elements of a slice can be accessed by indexing, e.g.
x = [1, 2, 3]
x[0] == 1 and x[1] == 2 and x[2] == 3 // true
Similar to Python, negative indexes work as offsets from the end, e.g.
x = [1, 2, 3]
x[-1] == 3 // true
x[-2] == 2 // true
Slices can be iterated over by for
loops, e.g.
x = [1, 2, 3]
// Prints: 1, 2, 3
for elem in x {
print(elem)
}
If you want the index as well, destructure the iterator into 2 values, just like arrays, e.g.
x = [1, 2, 3]
for idx, elem in x {
print("Element at index ${idx} is: ${elem}")
}
Using square brackets around a range expression acts as a shorthand for creating a slice from that range, e.g.
x = [1..5]
len(x) == 5 // true
x == [1, 2, 3, 4, 5] // true
If an actual slice consisting of range
values is desired, then parentheses
need to be used, e.g.
x = [(1..5)]
len(x) == 1 // true
x[0] == range(start: 1, end: 5) // true
If the range omits its start value when slicing, it defaults to 0
, e.g.
x = [1, 2, 3, 4, 5]
y = x[..2]
y == [1, 2, 3] // true
If the end value is left off, it defaults to the index of the last element, i.e.
len(slice) - 1
, or if the start value is negative, then it defaults to the
negative index of the first value, e.g.
x = [1, 2, 3, 4, 5]
y = x[2..]
z = x[-3..]
y == [3, 4, 5] // true
z == [3, 2, 1] // true
If both start and end are left off, it slices the whole range, e.g.
x = [1, 2, 3, 4, 5]
y = x[..]
y == [1, 2, 3, 4, 5] // true
Slicing with stepped ranges lets you get interleaved elements, e.g.
x = [10..20]
y = x[0,2..] // gets every other element
y == [10, 12, 14, 16, 18, 20] // true
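The defaulting rules for slicing with inclusive ranges can be collected into one function. The following Python sketch is a model of the examples above, not Iroh's implementation: `iroh_slice` and its parameters are hypothetical, and the rule that a range runs backwards when its end lies before its start is inferred from the `x[-3..]` example.

```python
def iroh_slice(values, start=None, end=None, second=None):
    """Model of values[start,second..end] with inclusive bounds."""
    n = len(values)
    if start is None:
        start = 0
    if end is None:
        # Negative starts walk back towards the first element.
        end = -n if start < 0 else n - 1
    first = start + n if start < 0 else start
    last = end + n if end < 0 else end
    if second is None:
        step = 1 if last >= first else -1
    else:
        step = (second + n if second < 0 else second) - first
        if step == 0:
            raise ValueError("step must not be zero")
    result = []
    index = first
    while (step > 0 and index <= last) or (step < 0 and index >= last):
        result.append(values[index])
        index += step
    return result


x = [1, 2, 3, 4, 5]
head = iroh_slice(x, end=2)
tail = iroh_slice(x, start=2)
backwards = iroh_slice(x, start=-3)
every_other = iroh_slice(list(range(10, 21)), start=0, second=2)
```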
When initializing a slice, if you already know the minimum capacity that’s needed, it can be specified ahead of time so as to avoid reallocations, e.g.
x = make([]int, cap: 100)
len(x) == 0 // true
cap(x) == 100 // true
The length can also be specified, e.g.
x = make([]int, cap: 100, len: 5)
len(x) == 5 // true
cap(x) == 100 // true
If length is specified, but not capacity, then capacity defaults to the same value as length, e.g.
x = make([]int, len: 5)
len(x) == 5 // true
cap(x) == 5 // true
By default, elements are initialized to their zero value. An alternative default can be specified if desired, e.g.
x = make([]int, len: 5, default: 20)
len(x) == 5 // true
cap(x) == 5 // true
x[0] == 20 // true
x[4] == 20 // true
Slices have a broad range of utility methods, e.g.
-
x.append
— adds one or more elements to the end. -
x.all
— returns true if all elements match the given predicate function. -
x.any
— returns true if any element matches the given predicate function. -
x.choose
— select n
number of elements from the slice. -
x.chunk
— split the slice into sub-slices of the specified length. -
x.clear
— removes all elements. -
x.combinations
— generate all possible combinations of the given length from the slice elements. -
x.contains
— checks if the slice has a specific element. -
x.count
— returns the number of elements that match the given predicate function. -
x.drop
— remove the first n
elements. -
x.drop_while
— keep removing elements until the given predicate function fails. -
x.enumerated
— returns a new slice where each element gives both the original element index and value. -
x.extend
— add all elements from the given slice to the end. -
x.filter
— return a new slice for elements matching the given predicate function. -
x.filter_map
— applies a function to each element and keeps only the non-nil
results as a new slice. -
x.first
— return the first element, if any; also takes an optional predicate function, in which case, the first element, if any, that matches that predicate will be returned. -
x.flatten
— flatten nested slices. -
x.flat_map
— transform each element with the given function and then flatten nested slices. -
x.for_each
— run a given function for each element in the slice. -
x.group_by
— group elements by the given key function. -
x.index_of
— returns the first matching index of a specific element if it exists. -
x.insert_at
— inserts one or more elements at a given index. -
x.intersperse
— insert a separator between all elements. -
x.join
— join all elements into a string with the given separator. -
x.last
— return the last element, if any. -
x.last_index_of
— returns the last matching index of a specific element if it exists. -
x.map
— return a new slice with each element transformed by a function. -
x.max
— returns the maximum element; a custom comparison function can be given when the elements are non-comparable. -
x.min
— returns the minimum element; a custom comparison function can be given when the elements are non-comparable. -
x.partition
— split into two slices based on a predicate function. -
x.permutations
— generate all possible permutations of the given length from the slice elements. -
x.pop
— removes the last element from the slice and returns it. -
x.prepend
— adds one or more elements to the start. -
x.reduce
— reduce the elements into one value using a given accumulator function and initial value. -
x.remove_at
— remove one or more elements at a given index. -
x.remove_if
— remove elements matching the given predicate function. -
x.reverse
— reverses the slice in-place. -
x.reversed
— returns a reversed copy of the slice. -
x.scan
— starting with an initial value and an accumulator function, keep applying to each element, and return a slice of all intermediary results. -
x.shift
— removes the first element from the slice and returns it. -
x.shuffle
— shuffles the slice in-place. -
x.shuffled
— returns a shuffled copy of the slice. -
x.sort
— sorts the slice in-place. -
x.sorted
— returns a sorted copy of the slice. -
x.split_at
— split into two separate slices at the given index. -
x.sum
— for slices of elements that support the +
operator, this returns the sum of all elements. -
x.take
— return the first n
elements. -
x.take_while
— return all elements up to the point that the given predicate function fails. -
x.transpose
— swap rows/columns for a slice of slices. -
x.unique
— remove any duplicate elements. -
x.unzip
— convert a slice of paired elements into two separate slices. -
x.window
— create sub-slices that form a sliding window of the given length. -
x.zip
— pair each element with elements from the given slice. -
x.zip_with
— pair each element with elements from the given slice, and apply the given transformation function to each pair.
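A few of the less familiar methods in the list above are easier to picture with concrete input. The Python sketch below gives one plausible reading of `chunk`, `window`, and `partition`; the exact Iroh behaviour (for example, whether a short final chunk is kept) is not specified here, so treat these as assumptions.

```python
def chunk(values, size):
    """Split into consecutive pieces of `size`; the last piece may be shorter."""
    return [values[i:i + size] for i in range(0, len(values), size)]


def window(values, size):
    """Overlapping runs of `size` elements, advancing one element at a time."""
    return [values[i:i + size] for i in range(len(values) - size + 1)]


def partition(values, predicate):
    """Elements that match the predicate, then those that do not."""
    matching = [v for v in values if predicate(v)]
    rest = [v for v in values if not predicate(v)]
    return matching, rest


chunks = chunk([1, 2, 3, 4, 5], 2)
windows = window([1, 2, 3, 4], 2)
evens, odds = partition([1, 2, 3, 4, 5], lambda v: v % 2 == 0)
```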
Methods like filter
and map
can use a Swift-like closure syntax for defining
their function parameters, e.g.
adult_names = people
.filter { $0.age > 18 }
.map { $0.name.upper() }
Likewise for sort
, e.g.
people.sort { $0.age < $1.age }
// Or, if you want a stable sort:
people.sort(stable: true) { $0.age < $1.age }
Iroh fully inlines these functions, so calls like people.filter { $0.age > 18}
perform exactly like a hand-written for loop, while being much more
readable.
When combined with range initialization, the whole sequence effectively becomes lazy, minimizing unnecessary allocation, e.g.
result = [1..1000]
.filter { $0 % 2 == 0 }
.map { $0 * 2 }
.take(10)
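The benefit of a lazy pipeline is that only as many source elements are visited as the final `take` needs. The following Python sketch shows the same idea with generators; it illustrates lazy evaluation in general and makes no claim about how Iroh compiles such chains. The `counted` helper is hypothetical and exists only to observe how much of the range is consumed.

```python
from itertools import islice


def counted(iterable, counter):
    """Pass items through while counting how many were pulled."""
    for item in iterable:
        counter[0] += 1
        yield item


pulled = [0]
source = counted(range(1, 1001), pulled)
evens = (n for n in source if n % 2 == 0)
doubled = (n * 2 for n in evens)
result = list(islice(doubled, 10))
```

Only the first 20 of the 1000 source values are ever produced, and no intermediate list is built.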
Methods like append
and insert_at
support both inserting one element at a
time, e.g.
x = [1..3]
x.append(10)
x == [1, 2, 3, 10] // true
As well as appending multiple elements at once, e.g.
x = [1..3]
x.append(4, 5, 6)
x == [1, 2, 3, 4, 5, 6] // true
While all elements from another slice can be appended by using the ...
splat
operator, e.g.
x = [1..3]
y = [5..7]
x.append(4, ...y)
x == [1, 2, 3, 4, 5, 6, 7] // true
The extend
method should be used instead for most use cases, e.g.
x = [1..3]
y = [4..7]
x.extend(y)
x == [1, 2, 3, 4, 5, 6, 7] // true
When a slice is assigned to a new variable or passed in as a parameter, the new name refers to the same underlying slice, e.g.
x = [1..5]
y = x
y[0] = 20
x == [20, 2, 3, 4, 5] // true
y == [20, 2, 3, 4, 5] // true
The copy
method needs to be used when an independent slice is wanted, e.g.
x = [1..5]
y = x.copy()
y[0] = 20
x == [1, 2, 3, 4, 5] // true
y == [20, 2, 3, 4, 5] // true
When copying slices where the elements are of a composite type, using the
deep_copy
method will recursively call copy on all sub-elements that are
deep-copyable, e.g.
x = [
{"name": "Tav", "surname": "Siva"},
{"name": "Alice", "surname": "Fung"}
]
// Deep copies are independent of each other.
y = x.deep_copy()
y[1]["surname"] = "Siva"
x[1]["surname"] == "Fung" // true
y[1]["surname"] == "Siva" // true
// However, standard copying is shallow.
z = x.copy()
z[1]["surname"] = "Siva"
x[1]["surname"] == "Siva" // true
z[1]["surname"] == "Siva" // true
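Python's standard `copy` module draws the same shallow versus deep distinction, which may be a useful reference point. The sketch below is Python, not Iroh, and uses a different replacement surname so that the effect of each assignment is visible.

```python
import copy

x = [
    {"name": "Tav", "surname": "Siva"},
    {"name": "Alice", "surname": "Fung"},
]

# A deep copy duplicates the nested maps, so edits stay local to the copy.
y = copy.deepcopy(x)
y[1]["surname"] = "Siva"
after_deep = x[1]["surname"]

# A shallow copy duplicates only the outer list; the maps are still shared.
z = copy.copy(x)
z[1]["surname"] = "Lee"
after_shallow = x[1]["surname"]
```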
Slices can be compared for equality by using the ==
operator, which compares
each element of the slice for equality, e.g.
x = [1, 2, 3]
if x == [1, 2, 3] {
// do something
}
Element-wise comparisons use dotted operators like MATLAB and Julia. When
comparison operators like .==
and .>
are used, they produce boolean slices,
e.g.
x = [1..5]
x .== 2 // [false, true, false, false, false]
x .> 3 // [false, false, false, true, true]
These can then be used to do boolean/logical indexing, i.e. returning the slice
values matching the true
values, as first introduced by MATLAB and made
popular by NumPy, e.g.
x = [1..10]
y = x[x .> 5]
y == [6, 7, 8, 9, 10] // true
Full broadcasting is supported, where operations are applied element-wise to arrays, slices, and collections of different shapes and sizes, e.g.
x = [0..5]
y = [10..15]
z = x .+ y
z == [10, 12, 14, 16, 18, 20] // true
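For readers unfamiliar with dotted operators, the element-wise comparison, boolean indexing, and element-wise addition above correspond to the following explicit steps. This is a plain Python sketch of the semantics; it says nothing about how Iroh vectorizes them.

```python
x = list(range(1, 11))

# x .> 5 produces one boolean per element.
mask = [value > 5 for value in x]

# x[mask] keeps the elements whose mask entry is true.
selected = [value for value, keep in zip(x, mask) if keep]

# a .+ b pairs up elements of equal-length operands.
a = list(range(0, 6))
b = list(range(10, 16))
summed = [p + q for p, q in zip(a, b)]
```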
These operations will be automatically vectorized on architectures that support vectorization, e.g. SIMD on CPUs, GPUs, etc. This will happen transparently without needing custom hints.
Iroh supports Julia’s dotted function mechanism for broadcasting the function, i.e. applying it element-wise for each item in the input, e.g.
double = (n) => n * 2
x = [0..5]
y = double.(x)
y == [0, 2, 4, 6, 8, 10] // true
This also works with Iroh’s standard library, which provides functions like
sin
and exp
, linear algebra functions like inv
and solve
, stats
functions like mean
and std
, axis-aware reductions, etc.
Multi-dimensional slices can be created by nesting slice types, e.g.
x = [][]int{
{1, 2, 3},
{4, 5, 6}
}
Where the element type can be inferred, the shorthand syntax can be used, e.g.
x = [[1, 2, 3], [4, 5, 6]]
Or use ;
to make it even shorter, e.g.
x = [1, 2, 3; 4, 5, 6]
x[0] == [1, 2, 3] // true
x[1] == [4, 5, 6] // true
Multiple ;
delimiters can be used to make more nested structures, e.g.
x = [1, 2; 3, 4;; 5, 6; 7, 8]
len(x) == 2 // true
len(x[0]) == 2 // true
x[0][0] == [1, 2] // true
Custom dimensions can also be specified to the constructor, e.g.
x = [2, 3]int{}
x == [0, 0, 0; 0, 0, 0] // true
By default, such multi-dimensional slices are zero-initialized. A different
default can be specified using ;
, e.g.
x = [2, 3; 7]int{}
x == [7, 7, 7; 7, 7, 7] // true
A custom fill function can also be specified. This function is passed the index positions for all dimensions, which can be used, if needed, to derive the fill value, e.g.
x = [2, 3]int{} << { $0 + $1 }
x == [0, 1, 2; 1, 2, 3] // true
The default row-major ordering can be overridden with @col
, e.g.
x = [2, 3@col]int{}
Or as explicit parameters to make
, e.g.
x = make([]int, shape: (2, 3), layout: .column_major, default: 7)
Existing arrays/slices can be reshaped easily, e.g.
x = [0..5]
// Cast it to a different shape to reshape it:
y = [2, 3]int(x)
// Or use the fluent syntax within method chains:
z = x.reshape(2, 3)
y == [0, 1, 2; 3, 4, 5] // true
y == z // true
Casting to a new slice will also do type conversion on the elements if the two slices have different element types. Any optional parameters will be passed along for the type conversion, e.g.
x = []int32{1, 2, 3}
// No parameters needed as upcasting is safe:
y = []int64(x)
// Specify type conversion parameter for downcasting, e.g.
z = []int8(x, policy: truncate)
Standard arithmetic operations on slices tend to perform matrix operations, e.g.
x = [1, 2; 3, 4]
y = [5, 6; 7, 8]
// Matrix multiplication:
z = x * y
z == [19, 22; 43, 50] // true
Slices can be indexed using other slices to get the elements at the given indexes, e.g.
x = [1, 2, 3, 4, 5]
y = x[[0, 2]]
y == [1, 3] // true
Multi-dimensional slices can also be sliced using ;
, e.g.
x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
x[..; 1] == [2, 5, 8] // Column/Dimension slice.
x[..1; 1..2] == [2, 3; 5, 6] // Range slice.
x[0,2..; -1..] == [3, 2, 1; 9, 8, 7] // Step slice.
x[[0, 2]; [1, 0]] == [2, 7] // Index slice.
Slice ranges can also be assigned to, e.g.
x = [1..5]
x[..2] = [7, 8, 9]
x == [7, 8, 9, 4, 5] // true
Likewise for multi-dimensional slices, e.g.
x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
x[..; 1] = [23, 24, 25]
x == [1, 23, 3; 4, 24, 6; 7, 25, 9] // true
The .=
broadcasting assignment operator can be used to assign a value to each
element in a range, e.g.
x = [1..5]
x[..2] .= 9
x == [9, 9, 9, 4, 5] // true
Slices support a broad range of operations that make them easy to use in domains like linear algebra, machine learning, physics, etc., e.g.
c = a * b // Matrix/vector multiplication.
c = a · b // Inner/dot product.
c = a ⊗ b // Tensor product (Outer product for vectors; Kronecker product for matrices).
c = a × b // Cross product.
The operations work as one would expect when they are applied to different types, e.g. multiplying by a scalar value will automatically broadcast, use matrix-vector multiplication when needed, etc.
Slices also support matrix norms and determinants with similar syntax to what’s used in mathematics, e.g.
|a| // Determinant for square matrices
||a|| // Vector: 2-norm, Matrix: Frobenius norm
||a||_0 // Vector/Matrix: Count non-zeros
||a||_1 // Vector: 1-norm, Matrix: 1-norm
||a||_2 // Vector: 2-norm, Matrix: spectral norm
||a||_inf // Vector: max norm, Matrix: infinity norm
||a||_max // Vector: max norm, Matrix: max absolute element
||a||_nuc // Matrix: nuclear norm
||a||_p // Vector: p-norm, Matrix: Schatten p-norm
||a||_quad // Vector: quadratic norm
||a||_w // Vector: weighted norm, Matrix: weighted Frobenius
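As a reminder of what the most common of these norms compute, here is a minimal Python sketch using the standard mathematical definitions. The function names are hypothetical and the code is not tied to Iroh's implementation.

```python
import math


def norm_0(v):
    """Count of non-zero entries."""
    return sum(1 for x in v if x != 0)


def norm_1(v):
    """Sum of absolute values."""
    return sum(abs(x) for x in v)


def norm_2(v):
    """Euclidean length."""
    return math.sqrt(sum(x * x for x in v))


def norm_inf(v):
    """Largest absolute value."""
    return max(abs(x) for x in v)


def frobenius(m):
    """Square root of the sum of squared matrix entries."""
    return math.sqrt(sum(x * x for row in m for x in row))


v = [3, -4, 0]
results = (norm_0(v), norm_1(v), norm_2(v), norm_inf(v))
```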
Subscripts with {i,j,k,l}
syntax can be used on slices for operations in
Einstein notation, e.g.
hidden{i,k} = input{i,j} * weights1{j,k} + bias1
output{i,l} = hidden{i,k} * weights2{k,l} + bias2
This is more readable than how frameworks like NumPy handle it, i.e.
hidden = np.einsum("ij,jk->ik", input, weights1) + bias1
output = np.einsum("ij,jk->ik", hidden, weights2) + bias2
This also removes the need for a number of functions, e.g. instead of having
a trace
function, diagonal elements can be easily summed with:
x = [1, 2, 3; 4, 5, 6; 7, 8, 9]
trace = x{i,i}
trace == 15 // true
Iroh’s approach also results in higher quality code, e.g. dimension mismatches can be caught at compile-time, the compiler can automatically optimize for the best operation order, etc.
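In Einstein notation, an index that appears on the right-hand side but not on the left is summed over. The Python sketch below spells out the loops behind a contraction such as `input{i,j} * weights1{j,k}` and behind the repeated-index trace; `contract` and `trace` are hypothetical helper names and the bias term is left out.

```python
def contract(a, b):
    """out[i][k] = sum over j of a[i][j] * b[j][k], i.e. "ij,jk->ik"."""
    return [
        [sum(a[i][j] * b[j][k] for j in range(len(b))) for k in range(len(b[0]))]
        for i in range(len(a))
    ]


def trace(m):
    """A repeated index, as in x{i,i}, sums along the diagonal."""
    return sum(m[i][i] for i in range(len(m)))


product = contract([[1, 2], [3, 4]], [[5, 6], [7, 8]])
diagonal_sum = trace([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```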
String Data Types
Iroh defaults to Unigraph text as it fixes many of the shortcomings of Unicode. Two specific Unigraph types are provided:
-
string
— the default type that’s suitable for most use cases. -
composable_string
— a special type that includes the separately namespaced Unigraph Compose Element IDs, which is useful for building text editors.
Both data types are encoded in UEF (Unigraph Encoding Format). But, where
string
only consists of Unigraph IDs, composable_string
can include both
Unigraph IDs and Compose Element IDs.
A number of string types are also provided for dealing with Unicode text:
-
utf8_string
— UTF-8 encoded text. -
wtf8_string
— WTF-8 for roundtripping between UTF-8 and broken UTF-16, e.g. on Windows where you can get unpaired surrogates in filenames.
In addition, the encoded_string
type takes specific encodings, e.g.
x = encoded_string("Hello", .iso_8859_1)
The supported encodings include:
enum {
ascii, // ASCII
big5, // Big 5
cesu_8, // CESU-8
cp037, // IBM Code Page 37
cp437, // IBM Code Page 437
cp500, // IBM Code Page 500
cp850, // IBM Code Page 850
cp866, // IBM Code Page 866
cp1026, // IBM Code Page 1026
cp1047, // IBM Code Page 1047
cp1361, // IBM Code Page 1361
custom, // User-defined
euc_jp, // EUC-JP
euc_kr, // EUC-KR
gb2312, // GB2312
gb18030, // GB18030
gbk, // GBK
hz_gb2312, // HZ-GB2312
iscii, // ISCII
iso_2022_jp, // ISO-2022-JP
iso_2022_kr, // ISO-2022-KR
iso_8859_1, // ISO-8859-1
iso_8859_2, // ISO-8859-2
iso_8859_3, // ISO-8859-3
iso_8859_4, // ISO-8859-4
iso_8859_5, // ISO-8859-5
iso_8859_6, // ISO-8859-6
iso_8859_7, // ISO-8859-7
iso_8859_8, // ISO-8859-8
iso_8859_8_i, // ISO-8859-8-I
iso_8859_9, // ISO-8859-9
iso_8859_10, // ISO-8859-10
iso_8859_11, // ISO-8859-11
iso_8859_13, // ISO-8859-13
iso_8859_14, // ISO-8859-14
iso_8859_15, // ISO-8859-15
iso_8859_16, // ISO-8859-16
johab, // Johab (KS C 5601-1992)
koi8_r, // KOI8-R
koi8_u, // KOI8-U
mac_cyrillic, // Mac OS Cyrillic
mac_greek, // Mac OS Greek
mac_hebrew, // Mac OS Hebrew
mac_roman, // Mac OS Roman
mac_thai, // Mac OS Thai
mac_turkish, // Mac OS Turkish
shift_jis, // Shift_JIS
tis_620, // TIS-620
tscii, // TSCII
ucs_2, // UCS-2
ucs_4, // UCS-4
uef, // UEF
utf_7, // UTF-7
utf_8, // UTF-8
utf_16, // UTF-16
utf_16be, // UTF-16BE
utf_16le, // UTF-16LE
utf_32, // UTF-32
utf_32be, // UTF-32BE
utf_32le, // UTF-32LE
viscii, // VISCII
windows_874, // Windows-874
windows_1250, // Windows-1250
windows_1251, // Windows-1251
windows_1252, // Windows-1252
windows_1253, // Windows-1253
windows_1254, // Windows-1254
windows_1255, // Windows-1255
windows_1256, // Windows-1256
windows_1257, // Windows-1257
windows_1258, // Windows-1258
wtf_8, // WTF-8
}
Any other encoding can be used by specifying a .custom
encoding and providing
an encoder that implements the built-in string_encoder
interface, e.g.
x = encoded_string(text, .custom, encoder: scsu)
String literals default to the string
type and are UEF-encoded. They are
constructed using double quoted text, e.g.
x = "Hello world!"
For convenience, pressing backslash, i.e. \
, in the Iroh editor will provide a
mechanism for entering characters that may otherwise be difficult to type, e.g.
Sequence | Definition |
---|---|
\a | ASCII Alert/Bell |
\b | ASCII Backspace |
\f | ASCII Form Feed |
\n | ASCII Newline |
\r | ASCII Carriage Return |
\t | ASCII Horizontal Tab |
\v | ASCII Vertical Tab |
\\ | ASCII Backslash |
\x + NN | Hex Bytes |
\u + NNNN | Unicode Codepoint |
\u{ + N... + } | Unigraph ID |
\U + NNNNNNNN | Unicode Codepoint |
As the Iroh editor automatically converts text into bytes as they are typed, this is purely a convenience for typing and rendering the strings. No escaping is needed internally.
String values can be cast between each type easily, e.g.
x = utf8_string("Hello world!")
y = encoded_string(x, .windows_1252)
z = string(y)
x == z // true
For safety, Unicode text is parsed in .strict
mode by default, i.e. invalid
characters generate an error. But this can be changed if needed, e.g.
// Skip characters that can't be encoded/decoded:
x = utf8_string(value, errors: .ignore)
// Replace invalid characters with the U+FFFD replacement character:
y = utf8_string(value, errors: .replace)
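Python's built-in decoder exposes the same three policies under the names `strict`, `ignore`, and `replace`, so it can serve as a quick way to see what each one does to invalid input. The sketch below is Python, not Iroh.

```python
raw = b"caf\xff!"  # 0xFF can never appear in valid UTF-8

try:
    raw.decode("utf-8")  # errors="strict" is Python's default
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Drop the invalid byte entirely.
ignored = raw.decode("utf-8", errors="ignore")

# Substitute U+FFFD for the invalid byte.
replaced = raw.decode("utf-8", errors="replace")
```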
Iroh also provides a null-terminated c_string
that is compatible with C. This
can be safely converted between the other types, e.g.
// This c_path value is automatically null-terminated when
// passed to C functions. The memory management for the
// value is also automatically handled by Iroh.
c_path = c_string("/tmp/file.txt")
// When converted back to any of the other string types,
// the null termination is automatically removed.
path = string(c_path)
path == "/tmp/file.txt" // true
The c_string
type will generate errors when strings that are being converted
have embedded nulls in them. These need to be manually handled with try
, e.g.
c_path = try c_string("/tmp/f\x00ile.txt")
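The reason embedded nulls must be rejected is that C treats the first null byte as the end of the string, so anything after it would be silently lost. A minimal Python sketch of the check and the round trip, with hypothetical helper names and UTF-8 assumed as the byte encoding:

```python
def to_c_string(text):
    """Encode text and append the terminating null byte."""
    data = text.encode("utf-8")
    if b"\x00" in data:
        raise ValueError("embedded null byte")
    return data + b"\x00"


def from_c_string(data):
    """Strip the terminating null byte and decode."""
    return data[:-1].decode("utf-8")


c_path = to_c_string("/tmp/file.txt")
path = from_c_string(c_path)

try:
    to_c_string("/tmp/f\x00ile.txt")
    rejected = False
except ValueError:
    rejected = True
```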
The length of all string types can be found by calling len
on it, e.g.
x = "Hello world"
len(x) == 11 // true
Calling len
returns what most people would expect, e.g.
x = utf8_string("🤦🏼♂️")
len(x) == 1 // true
Except for C strings, where it returns the number of bytes, Iroh always defines the length of a string as the number of visually distinct graphemes:
-
For Unicode strings, this is the number of extended grapheme clusters. Note that Unicode’s definition of grapheme clusters can change between different versions of the standard.
-
For Unigraph strings, this is the number of Unigraph IDs which are already grapheme based. Unlike with Unicode strings, this number will never change with new versions of the standard.
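To see why the choice of unit matters, consider a single visible character built from two code points. The Python sketch below counts it three different ways; Python's own `len` counts code points, and counting grapheme clusters would need a third-party library, so the grapheme count of 1 is stated in a comment rather than computed.

```python
# "e" followed by COMBINING ACUTE ACCENT renders as one grapheme.
text = "e\u0301"

code_points = len(text)
utf8_bytes = len(text.encode("utf-8"))
utf16_units = len(text.encode("utf-16-le")) // 2
```

A grapheme-based length reports 1 for this text, whichever encoding is used underneath.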
Strings can be concatenated by using the +
operator, e.g.
x = "Hello"
y = " world"
z = x + y
z == "Hello world" // true
Multiplying strings with an integer using the *
operator will duplicate the
string that number of times, e.g.
x = "a"
y = x * 5
y == "aaaaa" // true
Individual graphemes can be accessed from a string using the []
index
operator, e.g.
x = "Hello world"
y = x[0]
y == 'H' // true
Unigraph graphemes are represented by the grapheme
type, and Unicode ones by
the unicode_grapheme
type. Grapheme literals are quoted within single quotes,
e.g.
char = '🤦🏼♂️'
When graphemes are added or multiplied, they produce strings, e.g.
x = 'G' + 'B'
x == "GB" // true
Strings can also be sliced similar to slice types, e.g.
x = "Hello world"
y = x[..4]
y == "Hello" // true
All string types support the typical set of methods, e.g.
-
x.ends_with
— checks if a string ends with a suffix. -
x.index_of
— finds the position of a substring. -
x.join
— join a slice of strings with the value of x
. -
x.lower
— lower case the string. -
x.pad_end
— pad the end of a string. -
x.pad_start
— pad the start of a string. -
x.replace
— replace all occurrences of a substring. -
x.split
— split the string on the given substring or grapheme. -
x.starts_with
— checks if a string starts with a prefix. -
x.trim
— remove whitespace from both ends of a string. -
x.upper
— upper case the string.
For safety, all string values are immutable. They can be cast to a []byte
or
[]grapheme
to explicitly mutate individual bytes or graphemes, e.g.
x = "Hello world!"
y = []grapheme(x)
y[-1] = '.'
string(y) == "Hello world." // true
As a convenience, Iroh will automatically cast between strings and grapheme slices when values are passed in as parameters. The compiler will also try to minimize unnecessary allocations.
To control how values are formatted, we use explicit parameters, e.g.
x = 255
y = string(x, format: .hex_prefixed)
y == "0xff" // true
Unlike the cryptic format specifiers used in languages like Python, e.g.
amount = 9876.5432
y = f"{amount:>10,.2f}"
y == "  9,876.54" # true
We believe this is a lot clearer, i.e.
amount = 9876.5432
y = string(amount, scale: 2, thousands_separator: ",", width: 10, align: .right)
y == "  9,876.54" // true
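The named-parameter style can be emulated in Python by wrapping the format specifier in a small function, which also shows how each parameter maps onto a piece of the specifier. This is an illustrative sketch: `format_number` is a hypothetical helper, and it assumes the separator is not the same character as the decimal point.

```python
def format_number(value, scale=2, thousands_separator="", width=0, align="right"):
    """Format a number using named options instead of a format specifier."""
    # Group with commas first, then swap in the requested separator.
    text = f"{value:,.{scale}f}"
    if thousands_separator != ",":
        text = text.replace(",", thousands_separator)
    return text.rjust(width) if align == "right" else text.ljust(width)


amount = 9876.5432
named = format_number(amount, scale=2, thousands_separator=",", width=10, align="right")
cryptic = f"{amount:>10,.2f}"
```

Both calls produce the same ten-character result, padded on the left with two spaces.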
A range of format specifiers are available, e.g. quoted strings can be generated
by specifying the optional quote
parameter:
x = string("Tav", quote: true)
y = string(42, quote: true)
z = string("Hello \"World\"", quote: true)
print(x) // Outputs: "Tav"
print(y) // Outputs: "42"
print(z) // Outputs: "Hello \"World\""
Other format parameters include:
-
casing
— change the casing of a string value to snake_case
,kebab-case
,camelCase
,PascalCase
, etc. -
format
— to encode into different formats like base64
,hex
, etc. -
indent
— to control the amount and characters used for indentation. -
pad_with
— pad the output with a given grapheme up to a given width
. -
truncate
— truncate a string to a given width, with additional options to control how it gets truncated, e.g. at word boundaries, with a trailing ellipsis, etc.
They can also be provided as parameters on the format
method on string values,
e.g.
x = "Tav"
y = x.format(quote: true)
print(y) // Outputs: "Tav"
Sub-expressions can be interpolated into strings using 3 different forms:
-
${expr}
— evaluates the expression and converts the result into a string. -
={expr}
— same as above, but also prefixes the result with the original expression followed by the =
sign. -
#{expr}
— validates the expression, but instead of evaluating it, passes through the “parsed” syntax.
Interpolation using ${expr}
works as one would expect, e.g.
name = "Tav"
print("Hello ${name}!") // Outputs: Hello Tav!
It supports any Iroh expression, e.g.
name = "Tav"
print("Hello ${name.filter { $0 != "T" }}!") // Outputs: Hello av!
Interpolated expressions are wrapped within string( ... )
calls, so any
formatting parameters passed to string
can be specified after a comma, e.g.
amount = 9876.5432
print("${amount, scale: 2, thousands_separator: ","}") // Outputs: 9,876.54
Interpolation using ={expr}
is useful for debugging as it also prints the
expression being evaluated, e.g.
x = 100
print("={x + x}") // Outputs: x + x = 200
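Python 3.8 and later offer a comparable self-documenting form inside f-strings, where a trailing `=` echoes the expression text before its value. It is shown here only as a familiar point of comparison; the syntax differs from Iroh's.

```python
x = 100

# The "=" suffix prints the expression, then its value.
debug_line = f"{x + x = }"
```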
Interpolation using #{expr}
is only valid within template
strings, where it
allows for domain-specific evaluation of Iroh expressions, e.g.
time.now().format("#{weekday_short}, #{day} #{month_full} #{year4}")
// Outputs: Tue, 5 August 2025
Any function that takes a template
value can be used to construct tagged
template literals. These functions evaluate literals at compile-time for optimal
performance, e.g.
content = html`<!doctype html>
<body>
<div>Hello!</div>
</body>`
Finally, the []byte
and string
types support from
and to
methods to
convert between different binary-to-text formats, e.g.
x = "Hello world"
y = x.to(.hex)
z = y.from(.hex)
y == "48656c6c6f20776f726c64" // true
z == "Hello world" // true
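The same round trip can be checked independently with Python's standard library, which is a convenient way to verify expected values for formats such as hex and Base64. This sketch is Python, not Iroh, and assumes the text is UTF-8 encoded before conversion.

```python
import base64

data = "Hello world".encode("utf-8")

# Hex: two lowercase digits per byte.
hex_text = data.hex()
from_hex = bytes.fromhex(hex_text).decode("utf-8")

# Base64 (RFC 4648, padded).
b64_text = base64.b64encode(data).decode("ascii")
from_b64 = base64.b64decode(b64_text).decode("utf-8")
```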
The supported formats include:
enum {
ascii85, // Ascii85 (Adobe variant)
ascii85_btoa, // Ascii85 (btoa variant)
base2, // Base2 (binary string as text)
base16, // Base16 (alias of hex)
base32, // Base32 (RFC 4648, padded)
base32_unpadded, // Base32 (RFC 4648, unpadded)
base32_crockford, // Base32 (Crockford, padded)
base32_crockford_unpadded, // Base32 (Crockford, unpadded)
base32_hex, // Base32 (RFC 4648, extended hex alphabet, padded)
base32_hex_unpadded, // Base32 (RFC 4648, extended hex alphabet, unpadded)
base36, // Base36 (0-9, A-Z)
base45, // Base45 (RFC 9285, QR contexts)
base58_bitcoin, // Base58 (Bitcoin alphabet)
base58_check, // Base58 with 4-byte double-SHA256 checksum (no version)
base58_flickr, // Base58 (Flickr alphabet)
base58_ripple, // Base58 (Ripple alphabet)
base62, // Base62 (URL shorteners, compact IDs)
base64, // Base64 (RFC 4648, padded)
base64_unpadded, // Base64 (RFC 4648, unpadded)
base64_urlsafe, // Base64 (RFC 4648, URL-safe, padded)
base64_urlsafe_unpadded, // Base64 (RFC 4648, URL-safe, unpadded)
base64_wrapped_64, // Base64 (64-char wrapping, no headers)
base64_wrapped_76, // Base64 (76-char wrapping, no headers)
base85_rfc1924, // Base85 (RFC 1924 for IPv6 addresses)
base91, // Base91 (efficient binary-to-text)
base122, // Base122 (efficient binary-to-utf8-codepoints)
binary, // Binary string (alias of base2)
binary_prefixed, // Binary string (0b prefix)
hex, // Hexadecimal (lowercase)
hex_prefixed, // Hexadecimal (lowercase, 0x prefix)
hex_upper, // Hexadecimal (uppercase)
hex_upper_prefixed, // Hexadecimal (uppercase, 0x prefix)
hex_eip55, // Hexadecimal (EIP-55 mixed-case checksum)
html_attr_escape, // Escape HTML attribute values
html_escape, // Escape HTML text
json_escape, // JSON string escaping
modhex, // ModHex (YubiKey alphabet)
percent, // Percent/URL encode
percent_component, // Percent/URL encode (RFC 3986, component-safe set)
punycode, // Punycode (internationalized domains)
quoted_printable, // Quoted-printable (email)
rot13, // ROT13 letter substitution
rot47, // ROT47 ASCII substitution
uuencode, // UUEncoding (data body only, fixed line length)
xml_attr_escape, // Escape XML attribute values
xml_escape, // Escape XML character data
xxencode, // XXEncoding (data body only, fixed line length)
yenc, // yEnc (Usenet binary)
z85, // Z85 (ZeroMQ variant)
z_base32, // Base32 (Zooko, z-base-32 human-friendly variant)
}
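To make the round trip concrete, here is a minimal Python sketch covering a handful of the formats above using only the standard library. The to and from_ helpers and the format-name strings are hypothetical stand-ins for Iroh's methods and enum variants, not Iroh's actual implementation:

```python
import base64

# Hypothetical stand-ins for a few of the formats listed above.
ENCODERS = {
    "hex": lambda data: data.hex(),
    "hex_upper": lambda data: data.hex().upper(),
    "base32": lambda data: base64.b32encode(data).decode("ascii"),
    "base64": lambda data: base64.b64encode(data).decode("ascii"),
    "base64_urlsafe": lambda data: base64.urlsafe_b64encode(data).decode("ascii"),
}

DECODERS = {
    "hex": bytes.fromhex,
    "hex_upper": bytes.fromhex,  # fromhex accepts upper-case digits too
    "base32": base64.b32decode,
    "base64": base64.b64decode,
    "base64_urlsafe": base64.urlsafe_b64decode,
}


def to(data: bytes, fmt: str) -> str:
    """Encode raw bytes into the named text format."""
    return ENCODERS[fmt](data)


def from_(text: str, fmt: str) -> bytes:
    """Decode text in the named format back into raw bytes."""
    return DECODERS[fmt](text)


x = b"Hello world"
y = to(x, "hex")     # "48656c6c6f20776f726c64"
z = from_(y, "hex")  # b"Hello world"
```

Keeping the encoders and decoders in paired tables makes it easy to check that every format round-trips back to the original bytes.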
Optionals
Iroh supports optional
types which can either be nil
or a value of a
specific type, e.g.
x = optional[string](nil)
x == nil // true
x = "Hello"
x == "Hello" // true
A shorthand ?type
syntax is available too and defaults to nil
if no value is
provided, e.g.
x = ?string()
x == nil // true
x = "Hello"
x == "Hello" // true
For the following examples, let’s assume an optional name
field within a
Person
struct, e.g.
Person = struct {
name ?string
}
Optionals default to nil
, e.g.
person = Person{}
person.name == nil // true
Values can be easily assigned, e.g.
person = Person{}
person.name = "Alice"
When assigning nil
to nested optional types, it either needs to be explicitly
disambiguated, or the nil
value applies to the top-most level, e.g.
x = ??string("Test")
x = ?string(nil) // Inner optional
x = nil // Outer optional
Optional values can be explicitly unwrapped by using the !
operator, e.g.
name = person.name!
type(name) == string // true
Unwrapping a nil
value will generate an error. To safely unwrap, conditionals
with the ?
operator can be used, e.g.
if person.name? {
print(person.name.upper()) // person.name has been unwrapped into a string here
} else {
// Handle the case where person.name is nil
}
Iroh automatically narrows types based on conditionals with the ?
operator and on explicit unwraps with the !
operator. This makes the usage of optionals both
simple and safe.
Variables and fields that were initially typed as an optional remain optional even after being type narrowed via an unwrap, e.g.
if person.name? {
type(person.name) == string // true
person.name = nil // Valid assignment.
}
Multiple optionals can be unwrapped in the same conditional, e.g.
if person.name? and vip_list? {
// Both person.name and vip_list are unwrapped here.
}
Optional lookups can be chained, e.g.
display_name = person.name?.upper()
type(display_name) == ?string // true
The value at the end of an optional chain is always an optional. The ??
operator can be used to provide a default if the left side is nil
, e.g.
display_name = person.name?.upper() ?? "ANONYMOUS"
type(display_name) == string // true
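For readers more familiar with languages that lack these operators, the behaviour of optional chaining and the ?? default can be approximated in Python as follows. This is only a sketch of the semantics described above; the helper names are hypothetical:

```python
from typing import Optional


def upper_or_none(name: Optional[str]) -> Optional[str]:
    # Mirrors `person.name?.upper()`: the call is skipped when name is None,
    # so the result is itself optional.
    return name.upper() if name is not None else None


def display_name(name: Optional[str]) -> str:
    # Mirrors `person.name?.upper() ?? "ANONYMOUS"`: the default replaces
    # a None result, so the return type is no longer optional.
    result = upper_or_none(name)
    return result if result is not None else "ANONYMOUS"
```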
Assignments can use ?=
to only assign to a variable or field if it’s not
nil
, e.g.
person.name ?= "Alice"
Optionals work wherever an expression returns a value, e.g. after map lookups:
users[id]?.update_last_seen()
And can be used wherever types are accepted, e.g. in function parameters:
func greet(name ?string) {
print("Hello ${name ?? "there"}!")
}
When comparisons are made to a typed value, optionals are automatically unwrapped as necessary, e.g.
// The person.name value is implicitly unwrapped safely
// before the comparison:
if person.name == "Alice" {
// person.name has been unwrapped to a string with
// value "Alice"
} else {
// person.name can be either nil or a string value
// other than "Alice"
}
The various facets of our optionals system eliminate the issues caused by nil pointers, while maintaining a clean, readable syntax with predictable behaviour.
Map & Set Data Types
Iroh’s map
data types provide support for managing key/value data within hash
tables. Map types are generic over their key and value type, e.g.
profit = map[string]fixed128{
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
Where the types can be inferred, map values can be constructed using a {}
literal syntax, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
Duplicate keys in map literals will cause an edit-time error, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
"Consumer": 2239.50, // ERROR!
}
As types cannot be inferred from an empty {}
map literal, empty maps can only
be initialized with their explicit type, e.g.
profit = {} // ERROR!
profit = map[int]fixed128{} // Valid.
Maps are transparently resized as keys are added to them. To minimize the amount of resizing, maps can be initialized with an initial capacity, e.g.
// This map will only get resized once there are more
// than 5,000 keys in it:
profit = make(map[int]fixed128, cap: 5000)
Keys can be set to a value by assigning using the []
index operator, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
profit["Developer"] = 9368.30
len(profit) == 3 // true
Values assigned to a key can also be retrieved using the []
operator, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
consumer_profit = profit["Consumer"]
consumer_profit == 1739.50 // true
Assigning a value to a key will automatically overwrite any previous value
associated with that key. To only assign a value if the key doesn’t exist, the
set_if_missing
method can be used, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
// No effect as key exists.
profit.set_if_missing("Consumer", 0.0)
profit["Consumer"] == 1739.50 // true
// Value set for key as it doesn't already exist.
profit.set_if_missing("Developer", 9368.30)
profit["Developer"] == 9368.30 // true
When values are retrieved from a map using the []
operator, the zero value is
returned if the key has not been set, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
dev_profit = profit["Developer"]
dev_profit == 0.0 // true
If an alternative default value is desired, this can be specified using the
default
parameter when calling make
, e.g.
seen = make(map[string]bool, default: true)
tav_seen = seen["Tav"]
tav_seen == true // true
A custom initializer function can also be specified that derives a default value from the key, e.g.
// Users are fetched from the DB on first lookup.
user_cache = make(map[int]User, default: { db.get_user(id: $0) })
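A rough Python analogue of a key-derived default is a dict subclass that implements __missing__. The DefaultingMap class and fake_db_get_user function are hypothetical, and the choice to store the computed value so that the initializer runs once per key is an assumption, as the text above does not say whether Iroh caches it:

```python
class DefaultingMap(dict):
    """Dict whose missing keys are filled in by a function of the key."""

    def __init__(self, default_for_key):
        super().__init__()
        self._default_for_key = default_for_key

    def __missing__(self, key):
        # Called by dict's [] lookup only when the key is absent.
        value = self._default_for_key(key)
        self[key] = value  # assumption: cache so the initializer runs once per key
        return value


lookups = []


def fake_db_get_user(user_id):
    # Hypothetical stand-in for db.get_user; records each call it receives.
    lookups.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


user_cache = DefaultingMap(fake_db_get_user)
first = user_cache[42]   # triggers the initializer
second = user_cache[42]  # served from the map
```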
To check whether a key exists in a map, the in
operator can be used, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
consumer_profit_exists = "Consumer" in profit
consumer_profit_exists == true // true
if "Developer" in profit {
// As "Developer" has not been set in the map, this code
// block will not execute.
}
To safely access a key’s value, the get
method can be used. This will return
nil
for non-existent keys, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
consumer_profit = profit.get("Consumer")
consumer_profit == 1739.50 // true
dev_profit = profit.get("Developer")
dev_profit == nil // true
To minimize any confusion caused by nested optionals being returned by the
get
method, maps do not support using optional types as keys, e.g.
lookup = map[?string]int{} // ERROR!
The get
method can be passed an optional default
parameter that will be
returned if the given key is not found, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
dev_profit = profit.get("Developer", default: 0.0)
dev_profit == 0.0 // true
Keys can be removed using the delete
method, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
profit.delete("Consumer")
len(profit) == 1 // true
Maps support equality checks, e.g.
x = {1: true}
y = {1: true}
x == y // true
Maps can be iterated using for
loops, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
for key in profit {
print(key) // Outputs: Consumer, Enterprise
}
For safety, iteration order is non-deterministic in all execution modes except
onchain-script
, where it is deterministic and based on the transaction hash.
By default, iteration will return the map keys. To retrieve both the keys and values, the 2-variable iteration form can be used, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
for key, value in profit {
print("${key} = ${value}") // Outputs: Consumer = 1739.50, Enterprise = 4012.80
}
If just the values are desired, then the values
method can be called, e.g.
profit = {
"Consumer": 1739.50,
"Enterprise": 4012.80,
}
for value in profit.values() {
print(value) // Outputs: 1739.50, 4012.80
}
Maps provide a number of other utility methods, e.g.
-
clear: removes all entries from the map.
-
compute: updates the specified key by applying the given function to the current value.
-
copy: creates a copy of the map.
-
deep_copy: creates a deep copy of the map.
-
entries: returns a slice of key/value pairs.
-
filter: creates a new map with the key/value pairs matching a given predicate function.
-
filter_map: applies a function to each key/value pair and keeps only the non-nil results as a new map.
-
find: finds a key/value pair matching the given predicate function.
-
get_or_insert: gets the value for the given key, or inserts the given default value if the key has not been set.
-
group_by: groups entries into nested maps.
-
invert: creates a new map with the keys and values swapped.
-
keys: returns a slice of just the keys.
-
map_values: transforms all values according to the given transform function.
-
merge: merges another map into this one.
-
pop: deletes the given key and returns its value if it exists.
Any hashable type can be used as the key type within a map. Most built-ins like strings and numbers are hashable. User-defined types are automatically hashable if all of their fields are hashable.
If a user-defined type is unhashable, it can implement a __hash__
method to
make it hashable. This can also be useful to hash it on a unique value, e.g. a
primary key.
These methods can make use of the built-in hash
function that is used to hash
built-in types, e.g.
Person = struct {
id int64
first_name string
last_name string
}
func (p Person) __hash__() int64 {
return hash(p.id)
}
// Person can now be used as a key within map types, e.g.
balances = map[Person]fixed128{}
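The effect of hashing on a primary key can be sketched in Python, where a class supplies its own hash in a similar way. Pairing the hash with an equality check on the same field is an assumption made for this sketch; the text above does not specify how Iroh compares keys:

```python
class Person:
    def __init__(self, id: int, first_name: str, last_name: str):
        self.id = id
        self.first_name = first_name
        self.last_name = last_name

    def __hash__(self) -> int:
        # Hash only on the primary key, as in the example above.
        return hash(self.id)

    def __eq__(self, other) -> bool:
        # Assumption: two records with the same id are the same key.
        return isinstance(other, Person) and self.id == other.id


balances = {}
alice = Person(1, "Alice", "Fung")
balances[alice] = 100

# A second record with the same id finds the same entry, even though
# another field differs.
same_alice = Person(1, "Alice", "Fung-Smith")
```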
Certain built-in types like slices and sets are not hashable as they are
mutable. To use them as keys, they first need to be converted into
immutable const
values, e.g.
accessed = map[![]string]int{} // The ! in a type spec states that it is const
file_md = const ["path", "to", "file.md"]
accessed[file_md] = 1
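Python draws the same line between mutable and immutable sequences, which makes for a quick illustration of why the conversion is needed. This is an analogy only: a Python tuple plays the role of the const slice here:

```python
accessed = {}

# An immutable sequence is hashable, so it can be used as a key.
file_md = ("path", "to", "file.md")
accessed[file_md] = 1

# A mutable list is rejected as a key because its hash could change
# after insertion.
try:
    accessed[["path", "to", "file.md"]] = 1
    list_key_rejected = False
except TypeError:
    list_key_rejected = True
```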
The orderedmap
data type behaves exactly like a map
except that it keeps
track of insertion order and has deterministic iteration order, e.g.
fruits = orderedmap{
"apple": 10,
}
fruits["orange"] = 5
fruits["banana"] = 20
for fruit in fruits {
print(fruit) // Outputs: apple, orange, banana
}
Iroh also supports set
data types for unordered collections without any
repeated elements, e.g.
x = set[int]{1, 2, 3}
Where the type can be inferred, sets can be constructed using a {}
literal
syntax, e.g.
x = {1, 2, 3}
Duplicates in set literals will cause an edit-time error, e.g.
x = {1, 2, 3, 2} // ERROR!
Like with maps, empty sets can only be initialized with their explicit type, e.g.
x = {} // ERROR!
x = set[int]{} // Valid.
Elements can be added to a set using the add
method, e.g.
x = {1, 2, 3}
x.add(4)
len(x) == 4 // true
Elements can be removed using the remove
method, e.g.
x = {1, 2, 3}
x.remove(3)
len(x) == 2 // true
Sets can be iterated using for
loops, e.g.
x = {1, 2, 3}
for elem in x {
print(elem) // Outputs: 1, 2, 3 in some order
}
Like with maps, the iteration order of sets is non-deterministic in all modes
except onchain-script
where it will be deterministic and based on the
transaction state.
Checking if an element exists in a set can be done using the in
operator, e.g.
x = {1, 2, 3}
y = 3 in x
y == true // true
if 4 in x {
// As 4 is not in the set, this code block will not execute.
}
Sets can be combined using the typical operations, e.g.
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}
x.union(y) // {1, 2, 3, 4, 5, 6}
x.intersection(y) // {3, 4}
x.difference(y) // {1, 2}
x.symmetric_difference(y) // {1, 2, 5, 6}
For brevity, the equivalent operators can also be used, e.g.
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}
x | y // x.union(y)
x & y // x.intersection(y)
x - y // x.difference(y)
x ^ y // x.symmetric_difference(y)
These methods all return a new set and leave the original untouched. When
efficient memory usage is needed, in-place variants of the methods are also
available, prefixed with in_place_
, e.g.
x = {1, 2, 3, 4}
y = {3, 4, 5, 6}
x.in_place_union(y)
x == {1, 2, 3, 4, 5, 6} // true
Sets of the same type are also comparable to determine if a set is a subset or superset of another, e.g.
x = {1, 2, 3}
y = {1, 2, 3, 4, 5, 6}
if x <= y {
// x is a subset of y
}
if x < y {
// x is a proper subset of y
}
if y > x {
// y is a proper superset of x
}
Boolean & Logic Types
Iroh supports basic boolean logic via the bool
type which has the usual values:
-
true
-
false
The bool
values can be negated using the !
operator, e.g.
x = true
y = !x
y == false // true
The and
and or
operators apply boolean logic and evaluate in a
short-circuiting manner, e.g.
debug = false
// When debug is false, the fetch_instance_id call is never made.
if debug and fetch_instance_id() {
...
}
Iroh also supports a number of other logic types. Three-valued logic is
supported by the bool3
type which has the following values:
-
true
-
false
-
unknown
The unknown
value expands the usual logic operations with the following
combinations:
Expression | Result |
---|---|
true and unknown | unknown |
false and unknown | false |
true or unknown | true |
false or unknown | unknown |
!unknown | unknown |
While three-valued logic can be emulated via an optional bool, i.e. ?bool
, the
explicit use of unknown
instead of nil
makes intent clearer in domains like
knowledge representation.
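The bool3 combinations listed above match Kleene's strong three-valued logic, which can be sketched in Python by ordering the values as false < unknown < true. The Bool3 class and helper names are hypothetical and only model the truth table; they say nothing about how Iroh implements the type:

```python
from enum import Enum


class Bool3(Enum):
    FALSE = 0
    UNKNOWN = 1
    TRUE = 2


def and3(a: Bool3, b: Bool3) -> Bool3:
    # Conjunction is the minimum under the ordering false < unknown < true.
    return Bool3(min(a.value, b.value))


def or3(a: Bool3, b: Bool3) -> Bool3:
    # Disjunction is the maximum under the same ordering.
    return Bool3(max(a.value, b.value))


def not3(a: Bool3) -> Bool3:
    # Negation swaps true and false and leaves unknown unchanged.
    return Bool3(2 - a.value)
```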
Four-valued logic is supported by the bool4
type with the values:
-
true
-
false
-
unknown
-
conflicting
The conflicting
value expands the usual logic operations with the following
combinations:
Expression | Result |
---|---|
true and conflicting | conflicting |
false and conflicting | false |
unknown and conflicting | conflicting |
true or conflicting | true |
false or conflicting | conflicting |
unknown or conflicting | conflicting |
!conflicting | conflicting |
The use of bool4
can make logic much easier to follow, especially in domains
like distributed database conflicts, consensus algorithms, handling forks in
blockchains, etc.
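One way to reproduce the bool4 combinations above is with a simple priority rule, sketched here in Python. The Bool4 class and helpers are hypothetical, and treating pairs that do not involve conflicting in the same way as bool3 is an assumption of this sketch:

```python
from enum import Enum


class Bool4(Enum):
    TRUE = "true"
    FALSE = "false"
    UNKNOWN = "unknown"
    CONFLICTING = "conflicting"


def and4(a: Bool4, b: Bool4) -> Bool4:
    operands = {a, b}
    if Bool4.FALSE in operands:        # false dominates a conjunction
        return Bool4.FALSE
    if Bool4.CONFLICTING in operands:  # then conflicting
        return Bool4.CONFLICTING
    if Bool4.UNKNOWN in operands:      # then unknown
        return Bool4.UNKNOWN
    return Bool4.TRUE


def or4(a: Bool4, b: Bool4) -> Bool4:
    operands = {a, b}
    if Bool4.TRUE in operands:         # true dominates a disjunction
        return Bool4.TRUE
    if Bool4.CONFLICTING in operands:
        return Bool4.CONFLICTING
    if Bool4.UNKNOWN in operands:
        return Bool4.UNKNOWN
    return Bool4.FALSE


def not4(a: Bool4) -> Bool4:
    # Only true and false swap; unknown and conflicting are unchanged.
    if a is Bool4.TRUE:
        return Bool4.FALSE
    if a is Bool4.FALSE:
        return Bool4.TRUE
    return a
```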
Boolean values can be safely upcast to higher dimensions, e.g.
txn_executed = true
initial_status = bool4(txn_executed)
initial_status == true // true
The fuzzy
type supports fuzzy logic with fixed128
values between 0.0
(false
) and 1.0
(true
):
apply_brakes = fuzzy(1.0)
apply_brakes == true // true
Fuzzy logic is applied when evaluating fuzzy values, e.g.
Expression | Mechanism | Result |
---|---|---|
fuzzy(0.7) and fuzzy(0.4) | min(a, b) | fuzzy(0.4) |
fuzzy(0.7) or fuzzy(0.4) | max(a, b) | fuzzy(0.7) |
!fuzzy(0.7) | 1 - value | fuzzy(0.3) |
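The min, max, and complement mechanisms can be checked with a few lines of Python. Decimal is used here as a rough stand-in for fixed128 so that 1 - 0.7 is exactly 0.3; the helper names are hypothetical:

```python
from decimal import Decimal


def fuzzy(value: str) -> Decimal:
    """Build a fuzzy value, rejecting anything outside 0.0 to 1.0."""
    degree = Decimal(value)
    if not (Decimal(0) <= degree <= Decimal(1)):
        raise ValueError("fuzzy values must lie between 0.0 and 1.0")
    return degree


def fuzzy_and(a: Decimal, b: Decimal) -> Decimal:
    return min(a, b)


def fuzzy_or(a: Decimal, b: Decimal) -> Decimal:
    return max(a, b)


def fuzzy_not(a: Decimal) -> Decimal:
    return Decimal(1) - a
```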
Fuzzy membership is supported by various membership functions on slice data types that assign each element of the slice a value that represents its degree of membership, e.g.
-
x.gaussian_membership
-
x.s_shaped_membership
-
x.sigmoidal_membership
-
x.trapezoidal_membership
-
x.triangular_membership
-
x.z_shaped_membership
The in
operator can then be used to return the fuzzy
degree of membership of
a value, e.g.
// Temperature and humidity ranges.
temperature = [0..100]
humidity = [0..100]
// Temperature and humidity membership functions.
temp_low = temperature.trapezoidal_membership(0, 0, 20, 30)
temp_mid = temperature.trapezoidal_membership(25, 40, 60, 75)
temp_high = temperature.trapezoidal_membership(70, 80, 100, 100)
humid_low = humidity.triangular_membership(0, 0, 50)
humid_high = humidity.triangular_membership(50, 100, 100)
// Current conditions.
current_temp = 25
current_humidity = 60
// Get membership degrees.
temp_low_deg = current_temp in temp_low
temp_mid_deg = current_temp in temp_mid
temp_high_deg = current_temp in temp_high
humid_low_deg = current_humidity in humid_low
humid_high_deg = current_humidity in humid_high
// The membership degree values are fuzzy values:
type(temp_low_deg) == fuzzy // true
type(humid_low_deg) == fuzzy // true
// They can now be used to apply rules under fuzzy logic, e.g.
if temp_high_deg and humid_high_deg {
set_fan_speed(.high)
} else if temp_mid_deg and humid_low_deg {
set_fan_speed(.medium)
}
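To see what degrees the example above would produce, the two membership shapes can be sketched in Python. This assumes the conventional definitions, where trapezoidal takes the four corner points (a, b, c, d) and triangular takes (a, b, c) with its peak at b; the function names are hypothetical:

```python
def trapezoidal(x, a, b, c, d):
    """Degree of membership of x in a trapezoid with feet a, d and shoulders b, c."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge between a and b
    return (d - x) / (d - c)      # falling edge between c and d


def triangular(x, a, b, c):
    """A triangle is a trapezoid whose two shoulders coincide at the peak b."""
    return trapezoidal(x, a, b, b, c)


current_temp = 25
current_humidity = 60

temp_low_deg = trapezoidal(current_temp, 0, 0, 20, 30)       # 0.5
temp_mid_deg = trapezoidal(current_temp, 25, 40, 60, 75)     # 0.0
temp_high_deg = trapezoidal(current_temp, 70, 80, 100, 100)  # 0.0
humid_low_deg = triangular(current_humidity, 0, 0, 50)       # 0.0
humid_high_deg = triangular(current_humidity, 50, 100, 100)  # 0.2
```

Under these definitions, a temperature of 25 is half-way down the falling edge of the low range and only just at the foot of the mid range, so neither rule in the example would fire with any strength.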
This is extremely useful in various domains like AI decision making, IoT, control systems, risk assessment, recommendation engines, autonomous vehicles, etc.
Probabilistic logic is supported by the probability
data type, e.g.
rain = probability(0.3, .bernoulli)
A range of different distribution types are supported:
enum {
bernoulli,
beta,
beta_binomial,
binomial,
categorical,
cauchy,
chi_squared,
dirichlet,
discrete_uniform,
exponential,
f_distribution,
gamma,
geometric,
gumbel,
hypergeometric,
laplace,
logistic,
lognormal,
multinomial,
multivariate_normal,
negative_binomial,
normal,
pareto,
poisson,
power_law,
rayleigh,
student_t,
triangular,
uniform,
weibull,
zipf,
}
Each distribution type requires different parameters depending on its mathematical definition, e.g.
weather = probability([
("sunny", 0.6),
("rainy", 0.3),
("cloudy", 0.1)
], .categorical)
height = probability(175.0<cm>, .normal, std: 7.0<cm>)
customers_per_hour = probability(5.0, .poisson)
income = probability(50000.0, .lognormal, sigma: 0.5)
successful_trials = probability(0.3, .binomial, trials: 20)
The likelihood of a given value can be queried by using the p
method. This
returns the concrete probability or, for continuous distributions, the
probability density, e.g.
dice = probability(1, .discrete_uniform, max: 6)
// The chance of a 3 being rolled:
chance = dice.p(3)
chance == 0.166666666666666667 // true
The probability that a value falls within a specific range can be calculated
using the p_between
method, e.g.
dice = probability(1, .discrete_uniform, max: 6)
// The chance of a 4-6 being rolled:
chance = dice.p_between(4, 6)
chance == 0.5 // true
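The dice figures can be verified with a small Python model, assuming the die is treated as a discrete uniform distribution over the integers 1 to 6. The DiscreteUniform class is hypothetical and uses exact fractions rather than fixed128:

```python
from fractions import Fraction


class DiscreteUniform:
    """Equal probability for every integer from low to high inclusive."""

    def __init__(self, low: int, high: int):
        self.low = low
        self.high = high
        self.size = high - low + 1

    def p(self, value) -> Fraction:
        # Probability mass of a single outcome.
        if isinstance(value, int) and self.low <= value <= self.high:
            return Fraction(1, self.size)
        return Fraction(0)

    def p_between(self, lo: int, hi: int) -> Fraction:
        # Count the outcomes in [lo, hi] that the distribution can produce.
        lo = max(lo, self.low)
        hi = min(hi, self.high)
        count = max(0, hi - lo + 1)
        return Fraction(count, self.size)


dice = DiscreteUniform(1, 6)
chance_of_three = dice.p(3)                 # 1/6
chance_of_four_to_six = dice.p_between(4, 6)  # 1/2
```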
The sample
method draws a random value from the distribution and sample_n
returns a slice of multiple samples of the specified length, e.g.
dice = probability(1, .discrete_uniform, max: 6)
roll = dice.sample()
rolls = dice.sample_n(10)
len(rolls) == 10 // true
Some distributions are naturally boolean, e.g.
coin = probability(0.5, .bernoulli)
success = probability(0.3, .binomial, trials: 1)
Others can be converted to a boolean distribution with a predicate, or with the
in_range
and in_set
membership tests, e.g.
temperature = probability(22.0, .normal, std: 3.0)
weather = probability([
("sunny", 0.6),
("rainy", 0.3),
("cloudy", 0.1)
], .categorical)
// Convert with predicates, e.g.
is_hot = temperature > 25.0
is_sunny = weather == "sunny"
// Membership tests, e.g.
likely_fever = temperature.in_range(38.0, 42.0)
bad_weather = weather.in_set({"rainy", "cloudy"})
Boolean distributions can then be used within conditionals either using an explicit threshold, e.g.
if collision_risk.p(true) > 0.1 {
apply_emergency_brakes()
}
Or implicitly where a probability > 0.5
indicates true
, e.g.
if likely_fever {
prescribe_medication()
}
Boolean distributions can be combined with normal bool
values, e.g.
rain = probability(0.7, .bernoulli)
weekend = true
stay_inside = rain and weekend
go_out = !rain or weekend
maybe_picnic = !rain and weekend
if maybe_picnic {
print("Let's have a picnic!")
}
Assuming distributions are statistically independent, probabilistic logic operations can be applied to them, e.g.
// Garden watering logic.
rain = probability(0.7, .bernoulli)
sprinkler = probability(0.3, .bernoulli)
wet_from_rain_or_sprinkler = rain or sprinkler
// Autonomous driving.
safe_to_proceed = (visibility > 100.0) and weather.in_set({"sunny", "cloudy"})
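Under the independence assumption, these operations reduce to the standard product rules. A minimal Python sketch, using exact fractions and hypothetical helper names, works through the garden watering example:

```python
from fractions import Fraction


def p_not(p: Fraction) -> Fraction:
    return 1 - p


def p_and(a: Fraction, b: Fraction) -> Fraction:
    # Independent events: P(A and B) = P(A) * P(B)
    return a * b


def p_or(a: Fraction, b: Fraction) -> Fraction:
    # Inclusion-exclusion: P(A or B) = P(A) + P(B) - P(A and B)
    return a + b - p_and(a, b)


rain = Fraction(7, 10)
sprinkler = Fraction(3, 10)

wet_from_rain_or_sprinkler = p_or(rain, sprinkler)  # 79/100

# A plain bool behaves like a probability of exactly 0 or 1.
weekend = Fraction(1)
maybe_picnic = p_and(p_not(rain), weekend)  # 3/10
```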
Boolean distributions can also be compared against each other, e.g.
market_crash = probability(0.15, .bernoulli)
recession = probability(0.25, .bernoulli)
if recession > market_crash {
adjust_investment_strategy()
}
Constraints for Bayesian inference can be defined using the observe
method on
probability
types to create probability_observation
values, e.g.
rain = probability(0.3, .bernoulli)
sprinkler = probability(0.1, .bernoulli)
wet_grass = rain or sprinkler
evidence_wet = wet_grass.observe(true)
These can then be used with the |
and &
operators to evaluate conditional
probabilities, e.g.
// Posterior distribution for rain given evidence of observed wet grass.
rain_given_wet = rain | wet_grass.observe(true)
// Posterior distribution for disease given evidence of fever and cough.
disease_given_symptoms = disease | fever.observe(true) & cough.observe(true)
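As a worked example of the first posterior, the following Python sketch enumerates the four joint outcomes for rain and sprinkler with the priors of 0.3 and 0.1 used earlier, assuming the two are independent. It illustrates the arithmetic only and is not a description of how Iroh performs inference:

```python
from fractions import Fraction
from itertools import product

P_RAIN = Fraction(3, 10)
P_SPRINKLER = Fraction(1, 10)


def joint():
    """Yield (rain, sprinkler, probability) for every joint outcome."""
    for rain, sprinkler in product([True, False], repeat=2):
        weight = (P_RAIN if rain else 1 - P_RAIN) * (
            P_SPRINKLER if sprinkler else 1 - P_SPRINKLER
        )
        yield rain, sprinkler, weight


# wet_grass = rain or sprinkler
p_wet = sum(w for r, s, w in joint() if r or s)
p_rain_and_wet = sum(w for r, s, w in joint() if r and (r or s))

# Bayes: P(rain | wet) = P(rain and wet) / P(wet)
rain_given_wet = p_rain_and_wet / p_wet  # 30/37, roughly 0.81
```

Observing wet grass raises the probability of rain from the prior of 0.3 to about 0.81, because rain accounts for most of the ways the grass can end up wet.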
For correlated variables, the joint_distribution
type can be used, e.g.
financial_risks = joint_distribution([
("market_crash", probability(0.15, .bernoulli)),
("recession", probability(0.25, .bernoulli))
], correlation: 0.8)
Multiple variables can define correlations following the standard upper triangle order of a correlation matrix, e.g.
financial_risks = joint_distribution([
("market_crash", probability(0.15, .bernoulli)),
("recession", probability(0.25, .bernoulli)),
("inflation_spike", probability(0.20, .bernoulli))
], correlations: [0.8, 0.3, 0.6])
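To show how a flat list maps onto the matrix, here is a Python sketch that expands the values into a full symmetric correlation matrix. It assumes the upper triangle is read row by row, so that the three values above pair up as (market_crash, recession), (market_crash, inflation_spike), and (recession, inflation_spike); the function name is hypothetical:

```python
def correlation_matrix(names, upper):
    """Expand upper-triangle correlations into a symmetric matrix."""
    n = len(names)
    expected = n * (n - 1) // 2
    if len(upper) != expected:
        raise ValueError(f"expected {expected} correlations, got {len(upper)}")

    # Every variable is perfectly correlated with itself.
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    values = iter(upper)
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = next(values)
    return matrix


names = ["market_crash", "recession", "inflation_spike"]
matrix = correlation_matrix(names, [0.8, 0.3, 0.6])
```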
These support the standard operations while maintaining the correlations within the joint distribution, e.g.
crash, recession, inflation = financial_risks.sample()
The correlated probability values can be accessed via the dists
method to do
various calculations, e.g.
market_crash, recession, inflation_spike = financial_risks.dists()
// Joint probabilities, e.g.
disaster = market_crash and recession
// Conditional probabilities, e.g.
recession_given_crash = recession | market_crash.observe(true)
Our support for rich probabilistic logic makes Iroh useful in various domains like financial modelling, risk modelling, statistical modelling, Bayesian inference, medical diagnosis, robotics, etc.
Struct Data Types
Iroh provides struct
types to group together related fields under a single
data type, e.g.
Person = struct {
name string
age int
height fixed128<cm>
siblings []Person
}
Struct values can be initialized with braces and individual fields can be
accessed using .
notation, e.g.
zeno = Person{
name: "Zeno",
age: 11,
height: 136<cm>
}
print(zeno.name) // Outputs: Zeno
Unspecified values are zero-initialized, e.g.
-
Numeric types default to 0.
-
Boolean types default to false.
-
String types default to "", the empty string.
-
Slice types default to [], the empty slice.
-
Optional types default to nil.
-
Struct types have all their values zero-initialized.
For example:
alice = Person{name: "Alice"}
alice.age == 0 // true
alice.siblings == [] // true
Struct definitions can specify non-zero default values, e.g.
Config = struct {
retries: 5
timeout: 60<s>
}
cfg = Config{retries: 3}
cfg.retries == 3 // true
cfg.timeout == 60<s> // true
Struct fields can be assigned to directly, e.g.
cfg = Config{}
cfg.retries = 7
cfg.retries == 7 // true
If a field being initialized matches an existing variable name, the {ident: ident}
initialization can be simplified to just {ident}
, e.g.
retries = 3
cfg = Config{retries, timeout: 10<s>}
cfg.retries == 3 // true
Likewise, the ...
splat operator can be used to assign all fields of an
existing struct to another, with later assignments taking precedence over
earlier ones, e.g.
ori = Config{retries: 3, timeout: 10<s>}
cfg1 = Config{...ori}
cfg1 == {retries: 3, timeout: 10<s>} // true
cfg2 = Config{...ori, retries: 5}
cfg2 == {retries: 5, timeout: 10<s>} // true
cfg3 = Config{retries: 5, ...ori}
cfg3 == {retries: 3, timeout: 10<s>} // true
cfg4 = Config{retries: 5, ...ori, timeout: 20<s>}
cfg4 == {retries: 3, timeout: 20<s>} // true
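The "later wins" rule is the same one Python applies to dictionary unpacking, which gives a quick way to check the four cases above. This is an analogy only, with timeouts written as plain numbers of seconds:

```python
ori = {"retries": 3, "timeout": 10}

cfg1 = {**ori}                                # plain copy
cfg2 = {**ori, "retries": 5}                  # explicit field after the splat wins
cfg3 = {"retries": 5, **ori}                  # splat after the explicit field wins
cfg4 = {"retries": 5, **ori, "timeout": 20}   # each key takes its last assignment
```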
Iroh automatically detects when a struct should be passed as a pointer or copied based on some heuristics, so there is no need to specify whether it should be a pointer, e.g.
-
If a function/method needs to mutate a struct value, it is passed as a pointer.
-
Where there is no mutation, and the struct value is small enough, it is just copied.
For example:
// Person is automatically passed as a pointer as it is mutated:
func update_age(person Person, new_age int) {
person.age = new_age
}
// Config value is copied as it is not mutated and small enough:
func get_retries(cfg Config) int {
return cfg.retries
}
Occasionally, for performance reasons, it may be necessary to annotate the exact form that’s passed in. This should rarely be needed, e.g.
// Person is passed in as a *pointer even though it's not mutated:
func process_application(person *Person, info *Submission) {
...
}
// The time value is passed in by !value and copied:
func get_local_time(t !UnixTime) {
...
}
All pointers, outside of the [*]
pointers we have for interoperating with C,
are always non-null and automatically dereferenced on lookups and assignments.
Tuple Data Types
Iroh supports tuples for grouping a fixed number of values together. Tuple
values are enclosed within ()
parentheses, and indexed with []
like slices,
e.g.
london = (51.5074, -0.1278)
lat = london[0]
lng = london[1]
lat == 51.5074 // true
lng == -0.1278 // true
Tuple values can be destructured easily, e.g.
lat, lng = (51.5074, -0.1278)
lat == 51.5074 // true
lng == -0.1278 // true
Like structs, tuples can contain values of different types, e.g.
tav = ("Tav", 186<cm>)
However, unlike structs, tuples are immutable, i.e. their elements cannot be re-assigned:
london = (51.5074, -0.1278)
london[0] = 42.12 // ERROR!
Individual fields of a tuple can also be optionally named and accessed via their name, e.g.
london = (lat: 51.5074, lng: -0.1278)
london.lat == 51.5074 // true
london.lng == -0.1278 // true
// Indexed access still works
london[0] == london.lat // true
Both named and unnamed fields can be mixed within the same tuple, e.g.
tav = ("Tav", height: 186<cm>)
tav[0] == "Tav" // true
tav.height == 186<cm> // true
To avoid accidental bugs, the order of named fields must always match an expected tuple type, e.g.
func get_coords() (lat: float64, lng: float64) {
return (lng: -0.1278, 51.5074) // ERROR!
}
As tuples are iterable, len
returns their size, and they can be iterated with
for
loops, e.g.
london = (51.5074, -0.1278)
for point in london {
print(point) // Outputs: 51.5074 and then -0.1278
}
len(london) == 2 // true
Occasionally, single element tuples are useful, e.g.
-
Within APIs that are expecting tuple values.
-
To maintain consistency within data structures.
As single element tuples, e.g. (x)
, are indistinguishable from an expression
enclosed in parentheses, languages like Python let them be constructed with a
trailing comma, e.g.
# Constructing a single-element tuple in languages like
# Python and Rust:
t = (42,)
t[0] == 42 # True
This tends to cause confusion as trailing commas can be overlooked by even experienced developers. As such, we don’t provide a literal syntax for constructing single-element tuples.
Instead, they will need to be constructed using the tuple
constructor. This
constructor expands any iterable, e.g.
t = tuple("Tav")
len(t) == 3 // true
t[0] == "T" // true
Therefore, to construct single-element tuples, the value will need to be embedded within a single-element slice, e.g.
t = tuple(["Tav"])
len(t) == 1 // true
t[0] == "Tav" // true
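Python's built-in tuple constructor expands iterables in the same way, so the distinction between the two calls can be tried out directly:

```python
# A string is an iterable of characters, so it expands into three elements.
expanded = tuple("Tav")

# Wrapping the value in a single-element list yields a single-element tuple.
single = tuple(["Tav"])
```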
Enum Data Types
Iroh supports enum data types, e.g.
Colour = enum {
red
green
blue
}
Variants can be addressed with a leading .
and matched with the match
keyword, e.g.
match colour {
.red: set_red_bg()
.green: set_green_bg()
.blue: set_blue_bg()
}
The use of .
prefix for enum variants makes it less visually noisy than in
other languages, e.g.
// Compare this Rust:
set_bg(Colour::Red)
// To this Iroh:
set_bg(.red)
As a result, an enum variant only needs to be fully qualified when it’s used to declare a new variable, e.g.
bg = Colour.red
// The variable can then be assigned another variant without
// needing to use the fully qualified form, e.g.
bg = .green
Enum variants without any data do not get any implicit values, e.g.
Colour = enum {red, green, blue}
Colour.red == Colour.green // false
Colour.red == 0 // ERROR! Not comparable
Variants can be given explicit values as long as they are all of the same type. This makes the variants comparable to values of those types, e.g.
Colour = enum {
red = 1
green = 2
blue = 3
}
HTTPMethod = enum {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
}
Colour.red == 1 // true
HTTPMethod.get == "GET" // true
Mapped enums such as these automatically support being assigned values of the variant type, e.g.
http_method = HTTPMethod.get
http_method = .post // Assigned an enum variant
http_method = "PUT" // Assigned a string value mapping to an enum variant
When variables of these types are assigned literal values, they are validated at edit-time. Otherwise, they are validated at runtime and may generate errors, e.g.
http_method = HTTPMethod.post
http_method = "JIBBER" // ERROR!
As runtime values will be automatically converted during comparison, it may be useful to safely type cast them explicitly first, e.g.
http_method = try HTTPMethod(user_input)
if http_method == .get {
...
}
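The same validate-first pattern can be sketched with Python's enum module, where looking up a member by value fails for unknown input. The parse_method helper is hypothetical, and returning None on failure is just one way to surface the error:

```python
from enum import Enum
from typing import Optional


class HTTPMethod(Enum):
    GET = "GET"
    HEAD = "HEAD"
    POST = "POST"
    PUT = "PUT"


def parse_method(user_input: str) -> Optional[HTTPMethod]:
    """Convert untrusted input into a variant, or None if it doesn't map to one."""
    try:
        return HTTPMethod(user_input)  # lookup by value
    except ValueError:
        return None
```

Converting once at the boundary means that every later comparison is between two variants rather than between a variant and an unchecked string.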
When numeric values are being assigned to variants, they can use iota
to
define a pattern to use:
-
The value of iota starts at 0 and auto-increments by one for each subsequent variant that doesn’t define its own value.
-
Each new use of iota resets its value back to 0.
For example:
ErrorCode = enum {
// General errors (start at 0)
ok = iota // 0
unknown_method // 1
invalid_argument // 2
// Authentication errors (start at 100)
auth_failed = iota + 100 // 100
token_expired // 101
permission_denied // 102
// Network errors (start at 200)
timeout = iota + 200 // 200
connection_lost // 201
protocol_mismatch // 202
}
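The numbering in this example can be reproduced with a short Python sketch. It assumes that variants without their own value reuse the most recent iota expression with the incremented counter, which is what the commented values above imply; the assign_iota function and its input format are hypothetical:

```python
def assign_iota(variants):
    """Compute enum values from (name, offset) pairs.

    offset is the constant in an explicit `iota + offset` expression,
    or None when the variant does not define its own value.
    """
    values = {}
    iota = 0
    offset = 0
    for name, explicit_offset in variants:
        if explicit_offset is not None:
            iota = 0  # each new use of iota resets the counter
            offset = explicit_offset
        values[name] = iota + offset
        iota += 1
    return values


codes = assign_iota([
    ("ok", 0),
    ("unknown_method", None),
    ("invalid_argument", None),
    ("auth_failed", 100),
    ("token_expired", None),
    ("permission_denied", None),
    ("timeout", 200),
    ("connection_lost", None),
    ("protocol_mismatch", None),
])
```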
By default, an enum’s tag will be the smallest unsigned integer that can represent all the enum’s possible values, e.g.
// Iroh will use a uint2 for this enum's tag as the 2 bits
// will be able to represent all 4 possible values:
Status = enum {a, b, c, d}
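The tag width can be worked out with a one-line calculation, sketched here in Python. It assumes tags are numbered consecutively from 0 and that no variant has an explicit value; the tag_bits function is hypothetical:

```python
def tag_bits(variant_count: int) -> int:
    """Smallest number of bits able to represent tags 0..variant_count-1."""
    if variant_count < 1:
        raise ValueError("an enum needs at least one variant")
    # The largest tag is variant_count - 1; bit_length gives its width.
    return (variant_count - 1).bit_length()
```

Four variants need tags 0 to 3, which fit in 2 bits, while a fifth variant would push the tag to 3 bits.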
When the in-memory layout needs to be controlled, a custom size can be specified instead, e.g.
// A 20-bit tag will be used here, even though only 4 variants
// have been defined so far:
Status = enum(uint20) {a, b, c, d}
When the size exceeds 8 bits, a qualifier can be added to the custom size to control its endianness, e.g.
Status = enum(uint20be) {a, b, c, d}
The following qualifiers are supported:
-
be: big-endian.
-
le: little-endian (the default).
-
ne: native-endian, matching whatever the current CPU/OS uses.
Matches on variants must be exhaustive, i.e. they must match all possible variants, e.g.
Status = enum {a, b, c, d}
// This match will error at edit-time as the .d variant
// hasn't been handled:
match status {
.a:
.b:
.c:
}
This helps prevent any bugs caused by accidentally forgetting to handle certain
cases. The default
keyword can be used to act as a “catch-all”, e.g.
match http_method {
.get: handle_get()
.post: handle_post()
default: handle_everything_else()
}
This is useful for enums that might grow in the future as APIs evolve, as it ensures that existing code won’t break.
Without a catch-all, the exhaustive nature of enum matches can make it difficult for package authors to evolve their APIs, e.g.
// For an enum like this defined by a package author:
HTTPMethod = enum {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
}
// The following will compile as the match handles all 4 cases:
match http_method {
.get: handle_get()
.head: handle_head()
.post: handle_post()
.put: handle_put()
}
// But, if the package author adds a new enum .trace variant,
// like below, the `match` will fail as it's not exhaustive
// any more:
HTTPMethod = enum {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
trace = "TRACE"
}
To ensure that user code continues to work if a package author decides to add new variants, enums can be marked as `extensible`, e.g.
HTTPMethod = enum(extensible) {
get = "GET"
head = "HEAD"
post = "POST"
put = "PUT"
}
This will force all matches on those enums to always have a `default` case, so that they will continue to be exhaustive even if package authors add new variants, e.g.
match http_method {
.get: handle_get()
.head: handle_head()
.post: handle_post()
.put: handle_put()
default: handle_everything_else()
}
// When the enum gets extended with a new .trace variant, this
// match will continue to work thanks to the `default` case.
Multiple variants can be grouped with `;` in a single case. Within such branches, values are automatically type narrowed to only those variants, e.g.
match http_method {
.get:
handle_get()
.post; .put:
// Only .post and .put are valid variants inside here.
handle_common_code_for_post_and_put()
match http_method {
.post:
handle_post_only_stuff()
.put:
handle_put_only_stuff()
// No need to handle the other cases here.
}
.head:
handle_head()
default:
handle_anything_else()
}
Like in Rust, enum variants can also have associated structured data of different kinds, e.g.
Message = enum {
quit // No data
move{ // Struct data
x int32
y int32
}
write(string) // Single-element tuple
set_colour(r: uint8, g: uint8, b: uint8) // Multi-element tuple
}
These can be destructured within the `match` cases, e.g.
func process(msg Message) {
match msg {
.quit: // Handle quit
.move{x, y}: // Handle move
.write(text): // Handle write
.set_colour(r, g, b): // Handle set_colour
}
}
Field Annotations
Composite types, i.e. `struct` and `enum` types, can be annotated with structured data. The annotations can be on individual fields or the type as a whole, e.g.
Person = struct {
dob date json.Field{date_format: .rfc3339}
name string sql.Field{name: "user_name"}
updated_at ?timestamp json.Field{omit_empty: true}
} sql.Table{name: "people"}
Unlike in Go, where annotations have to be shoved into string values, e.g.
type Person struct {
Name string `json:"name" db:"user_name"`
}
Iroh annotations can be any compile-time evaluatable value. This can be used to easily add things like custom encodings, serialization, and validation, e.g.
BlogPost = struct {
contents encoded_string string_encoding.iso_8859_1
}
Application code can introspect the specific annotations at compile-time to drive behaviour, e.g.
annotation = Person.fields["updated_at"].annotation[json.Field]
if annotation {
if annotation.omit_empty {
// skip empty value ...
}
}
Destructuring
Iroh supports destructuring for many of its data types, e.g.
// Tuples
(a, b, c) = (1, 2, 3)
// Arrays
[a, b, c] = [3]int{1, 2, 3}
// Slices
[a, b, c] = [1, 2, 3]
// Structs
{x, y} = Point{x: 10, y: 20}
// Strings
<<"Hello ", name>> = "Hello Tav"
For elements of iterable values, specific elements can be skipped with a `_`, e.g.
[a, _, b] = [1, 2, 3]
The `...` splat operator can be used to match any “remaining” elements of an iterable, e.g.
[head, second, _, tail...] = [1, 2, 3, 4, 5]
head == 1 // true
second == 2 // true
tail == [4, 5] // true
The `...` splat operator can be in any position as long as it’s only used once, e.g.
[head, middle..., last] = [1, 2, 3, 4, 5]
head == 1 // true
middle == [2, 3, 4] // true
last == 5 // true
It can even be used to ignore a group of intermediate elements, e.g.
[head, ..., last] = [1, 2, 3, 4, 5]
head == 1 // true
last == 5 // true
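For readers coming from Python, the splat rules above map closely onto starred assignment targets. The following is only an analogy in Python, not Iroh code, and it makes no claim about how Iroh implements destructuring.

```python
# Python's starred targets play the role of Iroh's `...` splat.
head, second, _, *tail = [1, 2, 3, 4, 5]
print(head, second, tail)   # 1 2 [4, 5]

# The starred target may sit in the middle, but only one is allowed.
first, *middle, last = [1, 2, 3, 4, 5]
print(first, middle, last)  # 1 [2, 3, 4] 5

# Python has no bare `...` pattern, so `*_` is used to discard a run.
start, *_, end = [1, 2, 3, 4, 5]
print(start, end)           # 1 5
```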
As struct fields destructure by their field names, partial destructuring only needs to specify the desired field names, e.g.
{name} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}
name == "Tav" // true
Fields can be destructured to a different name with a `:`, e.g.
{name: user} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}
user == "Tav" // true
Nested elements can be destructured as needed, e.g.
{location: {lat, lng}, name} = Person{name: "Tav", location: {lat: 51.5074, lng: -0.1278}}
name == "Tav" // true
lat == 51.5074 // true
lng == -0.1278 // true
Destructured elements can set a default value with a `:` if the value is a zero-value, e.g.
{name: "Anonymous"} = Person{}
name == "Anonymous" // true
This can often be clearer than manually assigning to a local variable and checking its value before setting a default, e.g.
// Compare:
name = person.name
if !name {
name = "Anonymous"
}
// Versus:
{name: "Anonymous"} = person
The `()`, `[]`, and `{}` around destructured patterns can be elided when multiple elements are being destructured without any nesting or field renaming, e.g.
a, _, b = [1, 2, 3]
x, y = Point{x: 10, y: 20}
a == 1 // true
b == 3 // true
x == 10 // true
y == 20 // true
The Erlang-inspired `<<x, y>>` binary destructuring works well with both strings and byte slices, e.g.
// Decode binary data into specific data types, e.g.
<<version::uint8, length::uint32, checksum::int16>> = data
// Multi-byte integer types can specify an alternative to the
// default little-endian decoding, e.g.
<<version::uint8, length::uint32be, checksum::int24>> = data
// The splat operator can be used as usual, e.g.
<<version::uint8, header...>> = data
// The number of bytes to destructure can be specified with
// expressions or integer literals, e.g.
<<header::56, payload...>> = data
// These can even refer to previously destructured data, e.g.
<<version::uint8, length::uint32, header::length, payload...>> = data
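To show what the last pattern has to do at runtime, here is a rough Python equivalent built on the standard `struct` module. It is a sketch only: `parse` is a hypothetical helper, and raising an error on short input is an assumption, since the text does not say how Iroh reports a failed binary match.

```python
import struct

def parse(data: bytes):
    """Mimic <<version::uint8, length::uint32, header::length, payload...>>."""
    # "<" selects little-endian with no padding; B = uint8, I = uint32.
    version, length = struct.unpack_from("<BI", data, 0)
    offset = struct.calcsize("<BI")  # 5 bytes consumed so far
    header = data[offset:offset + length]
    if len(header) != length:
        raise ValueError("data is shorter than the declared header length")
    payload = data[offset + length:]  # the splat takes whatever remains
    return version, length, header, payload

data = bytes([1]) + (3).to_bytes(4, "little") + b"abc" + b"rest"
print(parse(data))  # (1, 3, b'abc', b'rest')
```

The `length` field is read first and then used to size `header`, which is the "refer to previously destructured data" behaviour described above.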
Functions & Methods
The `func` keyword is used to define both functions and methods. Functions can have both parameters and a return value, e.g.
// This takes 2 parameters and returns an int value:
func add(a int, b int) int {
return a + b
}
// This takes no parameters and returns a 2-tuple value:
func get_info() (name: string, height: int<cm>) {
return ("Tav", 186<cm>)
}
// This takes 2 parameters and returns nothing:
func set_info(id int, info (name: string, height: int<cm>)) {
cache.update(id, info)
}
Closures, i.e. anonymous functions with associated data, can be defined within function and method bodies using the `=>` syntax, e.g.
add = (a int, b int) => {
return a + b
}
If the anonymous function body is on the same line as the `=>`, then the `return` is implicit, i.e.
// Return is implicit whether there are braces, e.g.
add = (a int, b int) => { a + b }
// Or not:
add = (a int, b int) => a + b
Anonymous functions cannot define their return type, and thus the type of the return value must be inferrable. Functions which don’t take any parameters can omit them, e.g.
newline = () => print("")
// Print 3 newlines:
newline()
newline()
newline()
If anonymous functions don’t specify any parameters, but do receive parameters, then those parameters need to be inferrable, and can be referred to by their position, e.g. `$0`, `$1`, etc.
people = ["Reia", "Zaia"]
// This explicit function passed to the `map` method:
people.map((person string) => {
return person.upper()
})
// Can be simplified by omitting the parameter:
people.map(() => {
return $0.upper()
})
// Can be further simplified by putting it all on one line:
people.map(() => $0.upper())
// And even more clearly to just:
people.map { $0.upper() }
Functions, methods, and closures can all be passed as parameters in function calls, and even saved as values, e.g.
func calc(a int, b int, op (int, int) => int) int {
return op(a, b)
}
// Pass the add function as a parameter:
calc(1, 2, add)
Instruction = struct {
op (int, int) => int
a int
b int
}
// Store the add function as a struct field:
next = Instruction{
op: add,
a: 1,
b: 2
}
Functions can be variadic if a parameter name is prefixed with `...`, e.g.
func print_names(...names string) {
for name in names {
print(name)
}
}
The variadic parameter is automatically a slice of the given type, and can be called with zero or more values, e.g.
print_names() // Outputs: nothing
print_names("Zeno", "Reia", "Zaia") // Outputs: all 3 names
Slices can use the `...` splat operator to expand their elements when calling functions with variadic parameters, e.g.
names = ["Alice", "Tav"]
print_names(...names)
Parameters can be given default values by following the parameter name with a `:` and the default value, e.g.
func greet(name string, greeting: "Hello") string {
return "${greeting} ${name}"
}
greet("Alice") // Returns: Hello Alice
greet("Tav", "Hi") // Returns: Hi Tav
Iroh will generate edit-time warnings for certain function definitions, e.g.
- Functions that have more than 6 named parameters, as these tend to result in unwieldy APIs.
- Functions that use `bool` parameters instead of clearer `enum` ones, e.g. `bar(123, true)` vs. `bar(123, .update)`.
Functions which take a function parameter at the last position can use block syntax for that parameter, e.g.
func calc(a int, b int, op (int, int) => int) int {
return op(a, b)
}
// We call `calc` using block syntax for the `op` parameter:
calc(1, 2) { $0 + $1 }
When a parameter is a struct type, it can be inlined by eliding the `struct` keyword, e.g.
func run_server(cfg {host string, port int}) {
...
}
Functions which take a `struct` parameter at the last position can accept the struct fields as “named” arguments in the function call, e.g.
Config = struct {
log_level enum{info, debug, fatal}
port int
}
func run_server(cfg Config) {
...
}
// The Config fields can be passed in as "named" arguments, e.g.
run_server(log_level: .info, port: 8080)
// As Config fields will be default-initialized, only any
// necessary fields need to be specified, e.g.
run_server(port: 8080)
Function parameters can combine default values, variadic parameters, struct parameters, and trailing functions as long as they follow this order:
- Positional parameters with types.
- Positional parameters with default values, i.e. optional parameters, or a variadic parameter.
- Trailing `struct` parameter (with optional default field values).
- Trailing function parameter.
Parameters can also use destructuring syntax, e.g.
Config = struct {
log_level enum{ info, error, fatal }
port int
}
func run_server({port} = Config) {
// Only the `port` value is available in this scope.
}
func run_server({port: 8080} = Config) {
// The `port` value defaults to 8080 if it's not been specified.
}
For certain types of destructuring, types can be elided, e.g. when destructuring binary data:
func parse_packet(<<header::56, payload...>>) {
...
}
User-defined types “inherit” all of the methods of their underlying type and have certain methods autogenerated for them, e.g. `struct` types have the following methods created for them:
- `copy`, `copy_with`, `deepcopy`: for copying the value.
- `__hash__`: for deriving the hash value.
- `__str__`: for the default string representation of the value.
Custom methods can be defined on a type by placing a receiver for that type before the `func` definition, e.g.
Config = struct {
host string
port int
}
// Define the `validate` method:
(c Config) func validate() bool {
if !c.host {
return false
}
if 1024 <= c.port <= 65535 {
return true
}
return false
}
// Use the `validate` method:
c = Config{host: "", port: 8080}
if c.validate() {
...
}
The receiver component of the method definition, i.e. `(c Config)`, can use any variable name to refer to the value of the specified type. There’s no implicit `this` or `self`.
Static methods on the type can be defined by specifying `type` as the receiver name, e.g.
(type Config) func web_server() Config {
return Config{
host: "localhost",
port: 8080
}
}
// The static method can now be called on the type:
cfg = Config.web_server()
Conditionals
Iroh uses `if`, `else if`, and `else` blocks like most languages, e.g.
if i%3 == 0 and i%5 == 0 {
print("FizzBuzz")
} else if i%3 == 0 {
print("Fizz")
} else if i%5 == 0 {
print("Buzz")
} else {
print(i)
}
The conditions for `if` and `else if` need to evaluate to a `bool` value. If the value can’t be converted, it will generate an error, e.g.
Person = struct {
first_name string
last_name string
}
user = Person{}
// The following will generate a compile-time error as user
// cannot be converted to a bool.
if user {
...
}
For convenience, most built-in types support conversion to `bool`, e.g.
name = "Tav"
// Instead of this explicit conditional check:
if len(name) > 0 {
...
}
// The string can be used directly as string values only evaluate
// to true when they have a positive length.
if name {
...
}
The falsiness of values of built-in types is given by:
Type | When False |
---|---|
Booleans | false |
Numbers | 0 |
Arrays | len(x) == 0 |
Slices | len(x) == 0 |
Strings | len(x) == 0 |
Maps | len(x) == 0 |
Within conditionals, one can check if a value is within a range, e.g.
// Check if (age >= 21) and (age <= 35):
if 21 <= age <= 35 {
...
}
Assignments and destructuring can also be done within an `if` conditional as long as a `;` separated conditional is also checked, e.g.
if resp = api.get_user(id); resp.success {
// The resp variable does not pollute the outer scope.
}
User-defined types can define a `__bool__` method if they want to opt into automatic type coercion into bools, e.g.
State = struct {
command string
is_running bool
}
func (s State) __bool__() bool {
return s.is_running
}
state = State{}
if state { // Automatically checks state.is_running
...
}
Loops
Most languages tend to provide multiple constructs for looping, e.g. `for`, `while`, `do`, `foreach`, `repeat`, etc. This can be slightly confusing for those new to programming.
So Iroh instead follows Go’s approach and only uses one keyword, `for`, for all looping. Using `for` by itself results in an infinite loop, e.g.
for {
// Will keep running code in this block indefinitely.
}
Loops can be broken with the `break` keyword, e.g.
for {
now = time.now()
// This loop will stop as soon as the year ticks over.
if now.year > 2025 {
break
}
print(now)
time.sleep(1<s>)
}
Loops can be C-like, i.e.
for initialization; condition; increment {
...
}
For example:
for i = 0; i < 5; i++ {
print(i) // Prints 0, 1, 2, 3, 4
}
Loops can use the `continue` keyword to skip to the next iteration, e.g.
for i = 0; i < 5; i++ {
if i == 1 {
continue
}
print(i) // Prints 0, 2, 3, 4
}
To avoid common bugs, e.g. when loop variables are captured by closures, loop variables have per-iteration scope instead of per-loop scope, e.g.
for i = 0; i < 5; i++ {
func print_value() {
print(i)
}
print_value() // Prints 0, 1, 2, 3, 4
}
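The bug that per-iteration scope avoids is easiest to see in a language that has per-loop scope. Python is one such language, so the sketch below (Python, not Iroh; the function names are illustrative only) shows the problem and the usual manual workaround.

```python
def shared_scope():
    """Closures created in the loop all see the single shared `i`."""
    callbacks = []
    for i in range(3):
        callbacks.append(lambda: i)  # captures the variable, not its value
    return [cb() for cb in callbacks]

def per_iteration_scope():
    """Binding `i` as a default argument emulates a fresh `i` per iteration."""
    callbacks = []
    for i in range(3):
        callbacks.append(lambda i=i: i)  # copies the current value of `i`
    return [cb() for cb in callbacks]

print(shared_scope())         # [2, 2, 2]
print(per_iteration_scope())  # [0, 1, 2]
```

With per-iteration scope, the first form would already behave like the second, with no workaround needed.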
Loops can be nested, and labels can be used to exit specific loops, e.g.
outer: // a label for the outer loop, can be any identifier
for i = 0; i < 3; i++ {
for j = 0; j < 3; j++ {
print("={i}, ={j}")
if i*j == 4 {
print("Breaking out of both loops")
break outer
}
}
}
To execute code only when a loop has not been interrupted with a `break`, the `for` loop can be followed by a `fully` branch, e.g.
for i = 0; i < 3; i++ {
if i == 2 {
break
}
print(i)
} fully {
print("All values got printed!")
}
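Python’s `for ... else` has the same semantics as the `fully` branch described here, so it makes a convenient runnable comparison. The sketch below is Python, not Iroh, and `collect` is a hypothetical helper.

```python
def collect(limit, stop_at=None):
    """Gather values, recording "done" only if the loop was never broken."""
    seen = []
    for i in range(limit):
        if i == stop_at:
            break
        seen.append(i)
    else:
        # Runs only when the loop finished without hitting `break`.
        seen.append("done")
    return seen

print(collect(3, stop_at=2))  # [0, 1]
print(collect(3))             # [0, 1, 2, 'done']
```

Note that in the Iroh example above the loop breaks at `i == 2`, so by the same rule its `fully` branch would not run.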
Loops can also be conditional, i.e. will keep looping while the condition is `true`, e.g.
for len(x) > 0 {
print(x.pop())
}
To loop over ranges, the `for` keyword can be combined with the `in` keyword, e.g.
for i in 0..5 {
print(i) // Prints 0, 1, 2, 3, 4, 5
}
This also works with collections, e.g. iterating over a slice:
users = [{"name": "Tav"}]
for user in users {
print(user["name"])
}
Most collections also support iterating using `in` with 2 variables, e.g. in slices this will return each element’s index as well as the element itself:
users = [{"name": "Tav"}]
for idx, user in users {
print("${idx}: ${user["name"]}")
}
Similarly, iterating using just 1 variable over a `map` value gives just the keys, e.g.
user = {"name": "Tav", "location": "London"}
for key in user {
print(key) // Prints name, location
}
While iterating using 2 variables gives both the key and the value, e.g.
user = {"name": "Tav", "location": "London"}
for key, value in user {
print("${key} = ${value}")
}
When the loop variables are not needed, they can be elided, e.g.
for 0..5 {
...
}
Unlike boolean conditions which are re-evaluated on every loop, iterable expressions are only evaluated once. This is particularly useful when the iterables yield lazily, e.g.
for rate_limiter.available_slots() {
handle_next_request()
}
User-defined types can add support for iteration by defining the `__iter__` method which needs to return a type implementing the built-in `iterator` interface.
Types implementing `iterator` need to have a `__next__` method which returns the next value in the sequence, or `nil` when the iteration is complete, e.g.
Counter = struct {
current int
max int
}
func (c Counter) __iter__() iterator {
return c
}
func (c Counter) __next__() ?int {
if c.current < c.max {
current = c.current
c.current++
return current
}
return nil
}
counter = Counter{current: 2, max: 5}
for i in counter {
print(i) // Prints 2, 3, 4
}
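The method names are borrowed from Python’s iterator protocol, so the same `Counter` can be written in Python for comparison. This is Python rather than Iroh, and one difference is worth noting: Python signals completion by raising `StopIteration`, whereas the text above has `__next__` return `nil`.

```python
class Counter:
    """Python counterpart of the Counter struct above."""

    def __init__(self, current: int, maximum: int):
        self.current = current
        self.maximum = maximum

    def __iter__(self):
        # The object is its own iterator.
        return self

    def __next__(self) -> int:
        if self.current < self.maximum:
            value = self.current
            self.current += 1
            return value
        raise StopIteration  # Python's equivalent of returning nil

print(list(Counter(2, 5)))  # [2, 3, 4]
```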
Const Values
Iroh supports `const` values of different kinds. If the `const` is on the left hand side, then the value is compile-time evaluated, e.g.
const x = factorial(5)
x == 120 // true
All dependencies of such evaluations need to be compile-time evaluatable and cannot depend on runtime input, e.g.
const private_key_modulus = read_modulus() // ERROR!
func read_modulus() uint4096 {
return uint4096(io.read_all(stdin))
}
The `@read_file` and `@read_url` compile-time functions allow for reading various resources at compile-time, e.g.
const private_key = @read_file("~/.ssh/id_ed25519")
const logo_png = @read_url("https://assets.espra.com/logo.png")
Compile-time reads are cached on first read, and need to be explicitly uncached by running a clean build or by marking the resource with `watch`, e.g.
const config_data = @read_file("config.json", watch: true)
This mechanism is what powers our build system, e.g.
import "build"
func build(b build.Config) {
glfw = b.add_static_library("glfw", sources: glob("glfw/src/*.c"))
.add_include_path("glfw/include")
b.add_executable(root: "src/main.iroh")
.link_library(glfw)
.install()
}
Instead of needing a separate build config and language, like CMake, Autotools, or Gradle, we have the full power of Iroh available in our compile-time build system.
Compile-time const values can be defined within the top-level scope of a package or within function bodies, e.g.
func serialize_value() {
const debug = false
if debug {
...
}
}
Optimizations are then done based on these compile-time values, e.g. in the above example, the entire `if debug` block will be fully optimized away.
If `const` is on the right side of an assignment, i.e. at the head of an expression, then it is not compile-time evaluated, and instead marks the result of the expression as immutable, e.g.
path = "/home/tav"
split = const path.split("/")
split[1] = "alice" // ERROR! Cannot mutate an immutable value
When a value is marked as immutable, it can no longer be mutated. To support this, the compiler will try to re-use existing allocations wherever possible, and only make copies when necessary.
Refined Types
Types can be constrained to specific sub-ranges using `@limit` which limits a type with a constraint. Constraint expressions can refer to an instantiated value of the type as `this`, e.g.
Codepoint = @limit(uint, this <= 0x10ffff)
Port = @limit(uint16, this >= 1024 and this <= 65535)
Format = @limit(string, this in ["json", "toml", "yaml"])
Any compile-time evaluatable expression can be used as the constraint. When values of constrained types are initialized, literals are validated at compile-time, otherwise at runtime, e.g.
port1 = Port(8080) // Validated at compile-time (literal)
port2 = Port(user_input) // Validated at runtime (dynamic value)
Constraints are validated whenever there are any changes that could invalidate them, e.g.
StringList = @limit([]string, len(this) > 0)
x = StringList{"Tav"}
x.pop() // ERROR!
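As a rough mental model, a refined type behaves like a constructor that refuses values which violate the constraint. The Python sketch below illustrates this for `Port`. It is only an analogy: the class is hypothetical, and Python performs every check at runtime, whereas Iroh validates literals at compile-time.

```python
class Port(int):
    """Hypothetical stand-in for `Port = @limit(uint16, ...)`."""

    def __new__(cls, value: int) -> "Port":
        # The constraint from the Iroh definition, checked on construction.
        if not 1024 <= value <= 65535:
            raise ValueError(f"{value} is outside the allowed range 1024..65535")
        return super().__new__(cls, value)

print(Port(8080))  # 8080
try:
    Port(80)
except ValueError as err:
    print(err)     # 80 is outside the allowed range 1024..65535
```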
Constraining a type to a set of specific values can be written by prefixing them with a `const` and using `|` to separate the options, e.g.
Format = const "json" | "toml" | "yaml"
Priority = const 1 | 2 | 3
When constraining non-numeric types like strings to a set of values, the `const` prefix can be elided as the `|` bitwise OR operator doesn’t apply to them, e.g.
Format = "json" | "toml" | "yaml"
But the `const` prefix will still be needed if only one value is possible, e.g.
Format = const "json"
As string values are already immutable, this creates a value which doubles as both a type value with a single string value as well as an immutable string value.
Consumable Types
Struct types can be annotated as being `consumable` in order to treat them as linear types, e.g.
Transaction = struct(consumable) {
...
}
func (t Transaction) set_key(key string, value V) {
...
}
@consume
func (t Transaction) commit() {
...
}
@consume
func (t Transaction) rollback() {
...
}
All values of a consumable type must be discarded by calling a method that has been marked with the `@consume` decorator, e.g.
txn = db.new_txn()
txn.set_key("admin", "Tav")
txn.commit()
Failure to do so will result in an edit-time error, e.g.
txn = db.new_txn()
txn.set_key("admin", "Alice")
// ERROR! Neither txn.commit() nor txn.rollback() were called!
Once a value has been consumed, it can no longer be used, e.g.
txn = db.new_txn()
txn.set_key("admin", "Alice")
txn.commit()
txn.set_key("admin", "Zeno") // ERROR! txn value already consumed!
Up to one `@consume` method can be marked as `default`, e.g.
@consume(default: true)
func (f File) close() {
...
}
This will automatically consume the value with this method if it hasn’t been explicitly consumed by the time it goes out of scope, e.g.
if update_contents {
f = create_file("/home/tav/${filename}.md")
f.write(contents)
// The file is automatically consumed by f.close() here.
}
This enables package authors to provide APIs that are ergonomic and safe, without needing any manual cleanup.
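The closest everyday analogue to a default `@consume` method is a context manager. The Python sketch below is an illustration only, not Iroh’s mechanism: the `Transaction` class is hypothetical, rollback is assumed to be the default consumer, and the checks happen at runtime rather than at edit-time.

```python
class Transaction:
    """Runtime approximation of a consumable type with a default consumer."""

    def __init__(self):
        self.pending = {}
        self.state = "open"

    def _require_open(self):
        if self.state != "open":
            raise RuntimeError(f"transaction already consumed ({self.state})")

    def set_key(self, key, value):
        self._require_open()
        self.pending[key] = value

    def commit(self):
        self._require_open()
        self.state = "committed"

    def rollback(self):
        self._require_open()
        self.pending.clear()
        self.state = "rolled back"

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Default consumer: roll back if nothing consumed the value.
        if self.state == "open":
            self.rollback()
        return False

with Transaction() as txn:
    txn.set_key("admin", "Tav")
    txn.commit()

with Transaction() as forgotten:
    forgotten.set_key("admin", "Alice")

print(txn.state)        # committed
print(forgotten.state)  # rolled back
```

Unlike this sketch, the Iroh design reports a missing or repeated consume before the program ever runs.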
Expression-Based Assignment
Languages with a strong emphasis on expressions can easily lead to code with poor readability and high cognitive load, e.g. consider this Rust code:
let final_price = {
let base_price = if item.category == "premium" { if user.is_vip { item.price * 0.8 } else { item.price * 0.9 } } else { item.price };
let shipping_cost = if base_price > 50.0 { if user.location == "remote" { 15.0 } else { 0.0 } } else { 8.0 };
let tax_amount = if user.state == "CA" { if base_price > 100.0 { base_price * 0.08 } else { base_price * 0.06 } } else { base_price * 0.05 };
let total = base_price + shipping_cost + tax_amount;
if total > 200.0 {
if user.membership.is_some() {
total - user.membership.unwrap().discount_amount
} else {
total * if user.first_time_buyer { 0.95 } else { 1.0 }
}
} else {
if user.has_coupon {
if coupon.min_purchase <= total { total - coupon.amount } else { total }
} else {
if user.loyalty_points > 1000 { total - 10.0 } else { total }
}
}
};
An accidental semicolon somewhere can easily change the meaning of the entire calculation. To minimize such issues, Iroh takes a more pragmatic approach to expression-based assignments.
Expressions beginning with certain keywords like `if` and `else` can implicitly assign their block value to a variable as long as there are no further nested constructs, e.g.
base_price = if user.is_vip { price * 0.8 } else { price }
If multi-line computations are needed, or if nested constructs need to be used, the block is auto-indented and the value being assigned needs to be explicitly prefixed with a `=>`, e.g.
base_price =
if item.category == "premium" {
if user.is_vip {
=> item.price * 0.8
} else {
=> item.price * 0.9
}
} else {
=> item.price
}
The `do` keyword can be used to evaluate multi-line expression blocks for assignment, e.g.
value = do {
temp = expensive_calculation()
=> temp * 2 + 1
}
Similar to how `return` works within function bodies, `=>` ends computation within a block, i.e.
value = do {
temp = expensive_calculation()
=> temp * 2 + 1
// The following code will be unreachable, just like after a return.
print("This won't execute")
}
Variables declared within `do` blocks do not pollute the outer scope.
Expression-based assignment can also be nested, e.g.
base_price = do {
discount =
if item.category == "premium" {
if user.is_vip {
=> 0.2
} else {
=> 0.1
}
} else {
=> 0.0
}
=> item.price * (1 - discount)
}
For nested expressions where assignment is to an outer block, labels can be used and assignments can use the form `=>label value`, e.g.
base_price = outer: do {
discount =
if item.category == "premium" {
if user.is_genesis_member {
=>outer 0
} else if user.is_vip {
=> 0.2
} else {
=> 0.1
}
} else {
=> 0.0
}
=> item.price * (1 - discount)
}
Built-in Functions
Besides built-in types like `time`, Iroh provides various built-in functions to make certain common operations easier, e.g.
- `cd(path)`
  Change the working directory to the given path.
    cd("/home/tav")
- `cap(slice)`
  Return the capacity of the given slice.
    x = make([]int, len: 100)
    cap(x) == 100 // true
- `exit(code: 0)`
  Exit the process with the given status code.
- `fprint(writer, ...args, end: "")`
  Writes the given arguments to the writer using the same formatting as the `print` function.
    fprint(my_file, "Hello world!")
- `glob(pattern)`
  Returns files and directories matching the given pattern within the current working directory.
    for path in glob("*.ts") {
      // Do something with each .ts file
    }
- `len(iterable)`
  Return the length of the given iterable.
    len(["a", "b", "c"]) == 3 // true
- `max(...args)`
  Returns the maximum of the given values.
    max(1, 12, 8) == 12 // true
- `min(...args)`
  Returns the minimum of the given values.
    min(1, 12, 8) == 1 // true
- `print(...args, end: "\n")`
  Prints the given arguments to the standard output.
    // Output with a newline at the end:
    print("Hello world!")
    // Output without a newline at the end:
    print("Hello world!", end: "")
- `print_err(...args, end: "\n")`
  Prints the given arguments to the standard error.
    // Output with a newline at the end:
    print_err("ERROR: Failed to read file: /home/tav/source.txt")
    // Output without a newline at the end:
    print_err("ERROR: ", end: "")
- `read_input(prompt: "", raw: false)`
  Read input from the standard input.
    name = read_input("Enter your name: ")
- `read_passphrase(prompt: "", mask: "*")`
  Read masked passphrase from the standard input.
    passphrase = read_passphrase("Passphrase: ")
- `type(value)`
  Return the type of the given value. The fields of the response, i.e. the type value, can only be accessed at compile-time.
    type("Hello") == string // true
    match type("hello") {
      string: print("Got a string value!")
      int: print("Got an int value!")
      default: print_err("Got an unknown value!")
    }
Along with various compile-time functions, e.g.
- `@col`
  Convert an array/slice from row-major ordering to column-major.
- `@consume`
  Mark type methods that “consume” the type value.
- `@decorator`
  Mark a function/method as a compile-time decorator.
- `@limit(type, constraints...)`
  Create a refined type which is validated against the specified constraints.
- `@read_file(filepath, watch: false)`
  Returns the byte slice contents of reading the given file.
- `@read_url(url, watch: false)`
  Returns the byte slice contents of reading the given URL.
- `@relate(unit, relationships...)`
  Relate units to each other through equality definitions.
- `@row`
  Convert an array/slice from column-major ordering to row-major.
- `@transpose`
  Transpose the layout of an array/slice, e.g. row-major to column-major, and vice-versa.
- `@undo`
  Mark a specialized method as the variant to use when undoing a method call within an `atomic` block.
Inlining
The `inline` keyword tells Iroh to try and inline a function or loop at the call site for better performance, e.g.
inline func add(a int, b int) int {
return a + b
}
As a result of the `inline func` hint, Iroh will directly insert the code of `add` wherever it’s called instead of doing a regular function call, e.g.
// This code:
c = add(a, b)
// Gets transformed into:
c = a + b
This can be beneficial in performance critical code as the overhead of the function call is eliminated. Similarly, `noinline func` can be used to prevent a function from being inlined, e.g.
noinline func something_complex() {
...
}
This can be useful in a number of cases, e.g. better debugging thanks to improved stack traces, minimizing instruction cache misses caused by ineffective inlining, etc.
The `inline for` mechanism can be used to inline loops, e.g.
elems = [1, 2, 3]
inline for elem in elems {
process(elem)
}
If the length of the iterable is known at compile time, this will act as a hint to unroll the loop, i.e.
// This code:
inline for elem in elems {
process(elem)
}
// Gets transformed into:
process(elems[0])
process(elems[1])
process(elems[2])
// Or if the element values are known, perhaps even:
process(1)
process(2)
process(3)
If the length of the iterable isn’t known at compile time, then the `inline for` will act as a hint to the compiler to more aggressively optimize the loop, e.g.
- Inline any small function calls within the loop body.
- Move any loop-invariant code outside of the loop.
- If possible, convert the loop to use SIMD instructions.
Transmuting Values
The `as` keyword allows for a value to be reinterpreted as another type, e.g.
x, y = int128(1234) as [2]int64
This allows zero-copy reinterpretation of a value’s bits, e.g.
req = buf as APIRequest
Transmutations are checked for safety at edit-time. When edit-time verification isn’t possible (e.g., with dynamic length slices), runtime checks ensure safe conversion.
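To see what reinterpreting the bits of `int128(1234)` as `[2]int64` yields, the Python sketch below performs the same reinterpretation on a byte buffer. It is an illustration under two assumptions that the text does not confirm: the in-memory layout is little-endian, and the value is a signed two’s-complement integer. Python also copies the bytes, whereas `as` is described as zero-copy.

```python
import struct

def int128_as_int64_pair(value: int):
    """Reinterpret the 16 bytes of a signed 128-bit integer as two int64s."""
    raw = value.to_bytes(16, "little", signed=True)  # assumed little-endian layout
    # "<2q" reads two little-endian signed 64-bit integers, lowest bytes first.
    return struct.unpack("<2q", raw)

print(int128_as_int64_pair(1234))  # (1234, 0)
print(int128_as_int64_pair(-1))    # (-1, -1)
```

Under those assumptions, `x` in the example above would hold the low 64 bits and `y` the high 64 bits.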
Symbolic Programming
Iroh has first-class support for symbolic programming. New symbols are declared using the `sym` keyword, e.g.
sym x, y, z
These can then be used with functions from the `sym` package in the standard library to do things like symbolic differentiation, e.g.
import * from "sym"
sym x
expr = x³ + sin(x) + exp(x)
y = diff(expr, x)
y == 3x² + cos(x) + exp(x) // true
Symbolic integration, e.g.
sym x
integrate(x² * sin(x), x) // -x²*cos(x) + 2x*sin(x) + 2*cos(x)
Solve equations algebraically, e.g.
sym x
solve(x² + 5x + 6 == 0, x) // [-3, -2]
Do algebraic simplification, e.g.
sym x
simplify((x² - 1)/(x - 1)) // x + 1
Over time, more and more functions will be added to the standard library, so that Iroh is competitive with existing systems like Mathematica and SymPy.
Function Decorators
Similar to decorators in Python, Iroh allows decorators to extend or modify the behaviour of functions and methods in a clean, reusable, and expressive way, e.g.
app = http.Router{}
@app.get("/items/#{item_id}", {response: .json})
func get_item(item_id int) {
// fetch item from the database
return {"item_id": item_id, "item": item}
}
Decorators are evaluated at compile-time, enabling extensibility without any runtime overhead. The built-in `@decorator` specifies if a function or method can be used as a decorator.
The first parameter to a decorator is always the function that is being decorated. The decorator can wrap the function or replace it entirely, e.g.
@decorator
func (r Router) get(handler function, path template, config Config) function {
// register path with the router and handle parameter conversion
return (handler.parameters) => {
response = handler(handler.parameters)
match config.response {
.json:
r.encode_json(response)
default:
// handle everything else
}
}
}
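Since the feature is modelled on Python decorators, a Python version of the same idea may help as a reference point. This is a sketch rather than the Iroh implementation: the `Router` class and its `routes` table are hypothetical, the path is treated as an opaque key, and Python applies decorators when the module is loaded, not at compile-time.

```python
import json

class Router:
    """Minimal router whose `get` decorator wraps and registers a handler."""

    def __init__(self):
        self.routes = {}

    def get(self, path, response="json"):
        def decorate(handler):
            def wrapped(*args, **kwargs):
                result = handler(*args, **kwargs)
                # Encode the handler's return value according to the config.
                return json.dumps(result) if response == "json" else result
            self.routes[path] = wrapped
            return wrapped  # the wrapper replaces the original function
        return decorate

app = Router()

@app.get("/items/{item_id}")
def get_item(item_id):
    return {"item_id": item_id}

print(app.routes["/items/{item_id}"](7))  # {"item_id": 7}
```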
Software Transactional Memory
Iroh’s `atomic` blocks provide an easy way to take advantage of software transactional memory, e.g.
atomic {
if sender.balance >= amount {
sender.balance -= amount
receiver.balance += amount
}
risky_operation(sender)
}
In the example above, if `risky_operation` should generate an error, then the whole `atomic` block will be rolled back, i.e. it’ll be as if the entire block of code had never been run.
Atomic blocks naturally compose, e.g.
func transfer(sender Account, receiver Account, amount currency) {
atomic {
if sender.balance >= amount {
sender.balance -= amount
receiver.balance += amount
}
}
}
atomic {
transfer(alice, tav, 100)
transfer(tav, zaia, 150)
}
Behind the scenes, the use of `atomic` essentially operates on “shadow” variables. The “real” variables are only overwritten once the entire block succeeds.
To make this efficient for certain data structures, e.g. large maps, the compiler will automatically switch them to use immutable variants that support efficient copies and rollbacks.
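The shadow-variable idea can be sketched in a few lines of Python. This is a deliberately simplified, single-threaded illustration: the `atomic` helper and `Account` class are hypothetical, deep copies stand in for whatever the compiler actually does, and there is no conflict detection or retry.

```python
import copy
from contextlib import contextmanager

@contextmanager
def atomic(*objects):
    """Let the block mutate shadow copies; publish them only on success."""
    shadows = [copy.deepcopy(obj) for obj in objects]
    yield shadows
    # Reached only if the block did not raise: overwrite the real objects.
    for real, shadow in zip(objects, shadows):
        real.__dict__.update(shadow.__dict__)

class Account:
    def __init__(self, balance):
        self.balance = balance

def transfer(sender, receiver, amount, fail=False):
    with atomic(sender, receiver) as (s, r):
        if s.balance >= amount:
            s.balance -= amount
            r.balance += amount
        if fail:
            raise RuntimeError("risky operation failed")

alice, tav = Account(100), Account(0)
transfer(alice, tav, 60)
try:
    transfer(alice, tav, 30, fail=True)  # the shadows are simply discarded
except RuntimeError:
    pass
print(alice.balance, tav.balance)  # 40 60
```

The failed second transfer leaves both balances untouched because only the shadow copies were modified.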
For even greater control, types can define @undo
variants of methods that will
be called in reverse order to roll back aborted changes, e.g.
func (d Document) insert_text(text string, pos int) {
...
}
@undo
func (d Document) insert_text(text string, pos int) {
// This variant automatically gets called to rollback changes.
}
Atomic blocks provide optimistic concurrency without needing any explicit locks. The outermost atomic block essentially forms a transaction, and all changes that it makes are tracked automatically.
If any two transactions conflict, i.e. concurrently write to the same values, then one will be automatically rolled back and retried indefinitely until it either completes or errors.
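One common way to detect such conflicts is to version each value and validate the version at commit time. The Python sketch below simulates this in a single thread, with a hypothetical `rival_writer` callback standing in for a concurrent transaction. It bounds the number of retries for safety, and a real implementation would need the validate-and-commit step itself to be atomic.

```python
class VersionedCell:
    def __init__(self, value):
        self.value = value
        self.version = 0


def run_transaction(cell, update, interfere=None, max_attempts=10):
    """Read, compute, then commit only if nobody wrote in between."""
    for attempt in range(1, max_attempts + 1):
        seen_version = cell.version
        new_value = update(cell.value)    # work on a private copy
        if interfere is not None:
            interfere(attempt)            # simulate a concurrent writer
        if cell.version == seen_version:  # validate: no conflicting write
            cell.value = new_value
            cell.version += 1
            return attempt
        # Conflict detected: discard new_value and retry from scratch.
    raise RuntimeError("too many conflicts")


balance = VersionedCell(100)


def rival_writer(attempt):
    if attempt == 1:  # cause a conflict on the first attempt only
        balance.value += 5
        balance.version += 1


attempts = run_transaction(balance, lambda v: v - 30, interfere=rival_writer)
print(attempts, balance.value)
```

The first attempt is thrown away because the rival's write bumped the version, and the second attempt recomputes from the updated value.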
Instead of blindly retrying, transactions can be forced to retry only when the
values that they’ve read have actually changed. This can be controlled by using
the retry
keyword, e.g.
atomic {
if queue.is_empty() {
retry
}
job = queue.pop()
...
}
While transactions can roll back internal state changes, it’s generally impossible to roll back external effects such as writing output, e.g.
atomic {
if sender.balance < amount {
print("ERROR: Insufficient balance!")
}
}
In these cases, a good pattern is to only create side effects once an atomic
block has finished running, e.g.
result = atomic {
if sender.balance >= amount {
sender.balance -= amount
receiver.balance += amount
=> .successful_transfer
}
=> .insufficient_funds
}
if result == .insufficient_funds {
print("ERROR: Insufficient balance!")
}
Similarly, while atomic
blocks support asynchronous I/O calls, idempotency
keys should be used when calling external services, e.g.
idempotency_key = ...
atomic {
ok = <- call_transfer_api({amount, idempotency_key})
if ok {
mirrored.balance -= amount
}
}
Reactive Programming
Iroh supports reactivity natively. Any variable defined using :=
automatically updates whenever any variable that it depends on is updated, e.g.
a = 10
b = 20
c := a + b
b = 30
c == 40 // true
When a computed value can no longer be updated, i.e. when all the variables that it depends on are no longer updatable, it is no longer tracked, e.g.
func add(a int, b int) int {
c := a + b
return c // c is now untracked, as a and b can no longer be updated
}
To automatically perform side-effects whenever values are updated, the built-in
onchange
function can be given a handler function to run, e.g.
a = 10
b = 20
c := a + b
onchange {
print("c = ${c}")
}
b = 30 // Outputs: c = 40
b = 30 // No output as no change to affect computed value
The onchange
function returns a callback that can be used to explicitly remove
the registered handler, e.g.
a = 10
b = 20
c := a + b
remove_printer = onchange {
print("c = ${c}")
}
b = 40 // Outputs: c = 50
remove_printer()
b = 30 // No output as handler has been removed
As onchange
automatically tracks all the computed values that a handler
depends on, the handlers are automatically cleaned up whenever the values they
depend on are dropped.
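To make the moving parts concrete, here is a small Python model of computed values and change handlers. It is a sketch, not Iroh's mechanism: the `Cell` and `Computed` classes are hypothetical, and dependencies are listed by hand, whereas the Iroh compiler infers them.

```python
class Cell:
    """A plain updatable value that notifies dependents when it changes."""

    def __init__(self, value):
        self._value = value
        self._dependents = []

    def get(self):
        return self._value

    def set(self, value):
        if value != self._value:
            self._value = value
            for dep in list(self._dependents):
                dep.recompute()


class Computed:
    """Like `c := a + b`: recomputed whenever one of its sources changes."""

    def __init__(self, fn, sources):
        self._fn = fn
        self._handlers = []
        for src in sources:
            src._dependents.append(self)
        self._value = fn()

    def get(self):
        return self._value

    def recompute(self):
        new_value = self._fn()
        if new_value != self._value:  # only fire on an actual change
            self._value = new_value
            for handler in list(self._handlers):
                handler()

    def onchange(self, handler):
        self._handlers.append(handler)
        return lambda: self._handlers.remove(handler)  # removal callback


a, b = Cell(10), Cell(20)
c = Computed(lambda: a.get() + b.get(), [a, b])

log = []
remove_printer = c.onchange(lambda: log.append(f"c = {c.get()}"))

b.set(40)  # c becomes 50, so the handler runs
b.set(40)  # no change, so the handler does not run
remove_printer()
b.set(30)  # c becomes 40, but the handler has been removed
print(log, c.get())
```

Only the first update is logged: the second leaves the computed value unchanged, and the third happens after the handler was removed.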
This primitive is what powers many mechanisms in Iroh, e.g. our component-based UIs use this mechanism to automatically re-render the UI whenever dependent state values change.
When computed variables are defined, their definition is evaluated lazily, allowing for circular definitions, e.g.
temp_celsius := 0
temp_celsius := (temp_fahrenheit - 32) * 5/9
temp_fahrenheit := 32
temp_fahrenheit := (temp_celsius * 9/5) + 32
In such cases, the first :=
defines the initial value, and the second
definition defines how the value should be computed. This is cleaner than in
other languages and frameworks, e.g. Vue:
import { ref, computed } from 'vue'
const celsius = ref(0)
const fahrenheit = ref(32)
const celsiusComputed = computed({
get: () => (fahrenheit.value - 32) * 5/9,
set: (value) => fahrenheit.value = (value * 9/5) + 32
})
const fahrenheitComputed = computed({
get: () => (celsius.value * 9/5) + 32,
set: (value) => celsius.value = (value - 32) * 5/9
})
As the compiler knows which methods mutate underlying data structures,
reactivity works automatically for complex data structures like map
types,
e.g.
users = [{name: "Tav"}, {name: "Alice"}]
user_count := len(users)
users.append({name: "Zeno"})
user_count == 3 // true
As the compiler tracks dependencies and data flow, it’s able to optimally update values without much overhead, and in pretty much the same way that a developer would manually do so.
For certain operations, such as filtering a slice, the compiler can avoid unnecessary work by only filtering newly added elements and adding the matches to the computed value, e.g.
users = [{name: "Tav", admin: true}, {name: "Alice", admin: false}]
admins := users.filter { $0.admin == true }
// As users is mutated, admins would need to be recomputed.
// However, the filter is only run on the newly appended item
// to the slice.
users.append({name: "Zeno", admin: false})
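The saving can be seen in a hand-written Python equivalent. This sketch assumes append-only updates and uses a hypothetical `FilteredView` class; it counts predicate calls to show that existing elements are never re-tested.

```python
class FilteredView:
    """Keeps a filtered copy of a list, testing only newly appended items."""

    def __init__(self, items, predicate):
        self.predicate = predicate
        self.calls = 0  # how many times the predicate has run
        self.source = []
        self.result = []
        for item in items:
            self.append(item)

    def append(self, item):
        self.source.append(item)
        self.calls += 1
        if self.predicate(item):  # filter just the new element
            self.result.append(item)


users = FilteredView(
    [{"name": "Tav", "admin": True}, {"name": "Alice", "admin": False}],
    lambda u: u["admin"],
)
users.append({"name": "Zeno", "admin": False})
print([u["name"] for u in users.result], users.calls)
```

The predicate runs once per element (three times in total), whereas recomputing the whole filter after the append would have run it five times.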
This mechanism is applied even to nested collections, e.g.
users = [...]
number_of_active_admins := users
.filter { $0.admin }
.filter { len($0.recent_messages) > 0 }
Fields on struct
types can also be reactive, e.g.
Person = struct {
first_name string
last_name string
full_name := "${first_name} ${last_name}"
}
alice = Person{
first_name: "Alice"
last_name: "Fung"
}
alice.full_name == "Alice Fung" // true
Whenever a reactive computation can generate errors, the errors must be explicitly handled so that computed variables don’t need to propagate errors, e.g.
channel = "#espra"
msgs := try fetch_recent_messages(channel) or []
starred := msgs.filter { $0.starred }
The compiler will automatically batch updates based on dataflow analysis. This analysis is aware of I/O boundaries, so it can apply changes atomically even when dealing with async calls.
However, for cases where explicit control is needed, an atomic
block can
be used, e.g.
channel = "#espra"
atomic {
msgs := <- try fetch_recent_messages(channel) or []
starred := msgs.filter { $0.starred }
pinned := <- try fetch_pinned_messages(channel) or []
first_pinned := pinned.get(0)
}
Package Imports
Iroh supports referencing code from other packages using the import
keyword.
Like Go and JavaScript, we use static strings for referencing the package paths
which can be:
-
Tilde links to refer to packages on Espra:
import "~1.tav/some/package"
-
URL paths to refer to packages on some git repo that’s accessible over HTTPS:
import "github.com/tav/package"
-
Relative paths starting with either ./ or ../ refer to local packages:
import "./some/package"
-
Absolute paths to refer to packages within the standard library:
import "os/exec"
Imported packages are automatically aliased to a package identifier. This
defaults to the package_name
as specified by the package’s metadata, which in
turn defaults to the package’s file name.
Package names must satisfy the following regular expression:
^[a-z]([a-z0-9_]*[a-z0-9])?$
To support predictable naming, the package_name
must match one of the last two
slash-separated elements of the import path, e.g.
- If the package is imported as "~1.tav/json/v2", then the package_name must be either json or v2.
An automatic conversion from hyphens in an import path to underscores in the
package_name
is also supported for those wanting to use hyphens for aesthetic
purposes, e.g.
import "github.com/tav/chart-widget" // imported as chart_widget
Likewise, any dots in the matching import path element are converted to
underscores, e.g.
import "~1.tav/package/csv.lib" // imported as csv_lib
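The naming rules above can be combined into a small check. The Python sketch below is one reading of those rules, with a hypothetical `candidate_names` helper: it takes the last two path elements, applies the hyphen and dot conversions, and keeps only those that satisfy the regular expression.

```python
import re

PACKAGE_NAME = re.compile(r"^[a-z]([a-z0-9_]*[a-z0-9])?$")


def candidate_names(import_path):
    """Return the identifiers that a package_name is allowed to match."""
    elements = import_path.strip("/").split("/")[-2:]
    names = [e.replace("-", "_").replace(".", "_") for e in elements]
    return [n for n in names if PACKAGE_NAME.match(n)]


print(candidate_names("~1.tav/json/v2"))
print(candidate_names("github.com/tav/chart-widget"))
print(candidate_names("~1.tav/package/csv.lib"))
```

Note that the regular expression rejects names with a trailing underscore or a leading digit, so not every converted path element is a valid package_name.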
In the case of conflicts, or even just for personal preference, imported
packages can be bound to a different package identifier by using the as
keyword, e.g.
import "~1.tav/json/v2" as fastjson
For importing non-code resources like images, custom importers can be specified
via the using
keyword on import
statements, e.g.
import "github.com/micrypt/image"
import "~1.tav/app/hero.jpg" using image.optimize
Custom importers need to implement the built-in importer
interface and are
evaluated at compile-time. They are passed the resource data and metadata from
the Espra tilde link.
As with all package dependencies, custom imports can also be updated by refreshing the package manifest, e.g. to fetch and use a newer version of a resource.
To use the same custom importer for multiple packages, the with
statement can
be used, e.g.
import "github.com/micrypt/bundle"
with {
.importer = bundle.assets
} {
import "~1.tav/logo/espra.png"
import "~1.tav/logo/ethereum.png"
import "~1.tav/logo/bitcoin.png"
}
Code within imported packages is referenced using dot syntax, e.g.
fastjson.encode([1, 2, 3])
Explicitly referencing packages at call sites generally makes code easier to understand than importing individual references from packages, e.g.
-
We believe that it’s easier to know what’s going on in:
import (
"github.com/tav/encode/json"
"github.com/tav/encode/yaml"
)
json.encode([1, 2, 3])
yaml.encode([1, 2, 3])
Than in something like:
use serde_json::to_string;
use serde_yaml::to_string as to_string_yaml;
to_string([1, 2, 3])
to_string_yaml([1, 2, 3])
However, there are certain use cases where constantly referencing the package name will add unnecessary noise, e.g. when referencing JSX-esque components, unit definitions, enums, etc.
For these cases, a special *
import form can be used, e.g.
import * from "github.com/tav/calendar"
By default, this will import all exported references from the package that start with an upper-case letter, any units, as well as the package itself. This makes it cleaner to use some packages, e.g.
import * from "github.com/tav/calendar"
func App(initial_date string) {
selected_date = calendar.parse_rfc3339(initial_date)
return (
<CalendarWidget selected_date>
<Heading>"Event Start"</Heading>
</CalendarWidget>
)
}
Package authors can use the starexport
keyword to explicitly control what
exports will be available directly in other packages when they import via *
,
e.g.
starexport CalendarWidget, parse_rfc3339
In script
mode, it’ll be possible to import all exported identifiers into the
current package using the special **
import syntax, e.g.
import ** from "github.com/tav/math"
This feature can be enabled as a config option in script
mode. For example,
the Iroh REPL will have this always on, as it can get tedious to keep typing the
package name repeatedly inside the REPL, e.g.
// The standard approach involves a lot more typing in the REPL:
y = math.sqrt(math.sin(x) + math.cos(x))
// The ** imports make it shorter:
y = sqrt(sin(x) + cos(x))
If imported identifiers conflict in **
imports, identifiers from later imports
will override those with the same name from previous ones.
Visibility Control
Within packages, everything defined within the package’s global scope, i.e. variables, functions, types, and even the fields within the types, is fully visible to the rest of the code in the package.
However, outside of the package, i.e. in code that imports a package, visibility is constrained:
-
For a type value to be accessible, its identifier must start with a Latin capital letter, i.e. A to Z.
-
For a non-type value to be accessible, i.e. a const value, function, a field within a public type, etc., its identifier must not start with an _.
For example, if a model
package defined:
supported_countries = ["China", "UK", "USA"]
_offensive_names = []
location = struct {
lat int
lng int
}
Person = struct {
name string
country string
_date_of_birth time.Date
}
func (p Person) is_over_21() bool {
...
}
func (p Person) _has_offensive_name() bool {
...
}
func _calc_age(p Person) int {
...
}
Then, in a package that imported it:
// Accessible
model.Person
model.supported_countries
person.name
person.country
person.is_over_21()
// Inaccessible
model.location
model._offensive_names
model._calc_age
person._date_of_birth
person._has_offensive_name()
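The two rules are simple enough to express as a predicate. The Python sketch below uses a hypothetical `is_exported` helper and checks identifiers in isolation; it does not model the extra requirement that a field or method is only reachable when its type is itself accessible.

```python
def is_exported(identifier, is_type):
    """Apply the two visibility rules to a single identifier."""
    if is_type:
        return "A" <= identifier[0] <= "Z"  # types need a Latin capital
    return not identifier.startswith("_")   # others: no leading underscore


examples = [
    ("Person", True),
    ("location", True),
    ("supported_countries", False),
    ("_offensive_names", False),
    ("name", False),
    ("_date_of_birth", False),
    ("is_over_21", False),
    ("_calc_age", False),
]
for identifier, is_type in examples:
    print(identifier, is_exported(identifier, is_type))
```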
Iroh’s approach reflects established norms within the programming community of
prefixing private fields with underscores, and avoids the need for public
and
private
visibility modifiers.
Environment Variables
Iroh provides built-in functions like $env.get
and $env.lookup
to get
the values of environment variables, e.g.
log_level = $env.get("LOG_LEVEL")
Similarly, $env.set
can be used to set these values, e.g. when you need to
spawn external commands with that env value:
$env.set("LOG_LEVEL", "warn")
All environment variable values can be iterated over using $env.all
, e.g.
for env_name, env_value in $env.all() {
// Do something with each environment variable.
}
As this can get noisy in scripts, Iroh also provides syntactic sugar for
variables that start with $
and are followed by a sequence of upper-case Latin
letters, numbers, or underscores, e.g.
log_level = $LOG_LEVEL
Besides being a shortcut for getting environment variable values, they can also be used to update values easily, e.g.
$LOG_LEVEL = "warn"
Well-established environment variables which need to be treated as lists, such as
$PATH
and $CFLAGS
are transparently converted into string slices, e.g.
$PATH == [
"/usr/local/bin",
"/usr/bin",
"/bin",
"/usr/sbin",
"/sbin"
]
When this value gets manipulated, the underlying environment variable gets updated, e.g.
$PATH.prepend("/home/tav/bin")
These can be cast to a string
to get the encoded form, e.g.
path = string($PATH)
path == "/home/tav/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin" // true
Other well-established environment variables are also appropriately typed, e.g.
$VERIFY_SSL
and $NO_COLOR
are treated as booleans, $HTTPS_PORT
and
$TIMEOUT
are treated as integers, etc.
This typed nature allows for default values to be set easily, e.g.
timeout = $TIMEOUT or 60
Environment variable values are of the environ
data type. Custom registrations
can be defined at compile-time, e.g.
environ.register("ESPRA_DEBUG", type: bool)
environ.register("KICKASS_IMPORT_PATH", type: []string, delimiter: ";")
Any $
references to those environment variable names will then be treated as
expected, e.g.
$KICKASS_IMPORT_PATH.prepend("/home/tav/kickass")
string($KICKASS_IMPORT_PATH) == "/home/tav/kickass;/usr/local/kickass" // true
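A typed registry of this kind boils down to a decode step when reading and an encode step when writing. The Python sketch below illustrates that round trip with hypothetical `register`, `decode`, and `encode` helpers. The set of strings accepted as true for booleans is an assumption made for the example, not something Iroh specifies here.

```python
REGISTRY = {}


def register(name, type_, delimiter=":"):
    REGISTRY[name] = (type_, delimiter)


def decode(name, raw):
    """Turn the raw string from the environment into a typed value."""
    if raw is None:
        return None  # unset variable, so `or default` can apply
    type_, delimiter = REGISTRY.get(name, (str, None))
    if type_ is list:
        return raw.split(delimiter) if raw else []
    if type_ is bool:
        return raw.lower() in ("1", "true", "yes")  # assumed truthy spellings
    return type_(raw)


def encode(name, value):
    """Turn a typed value back into the string stored in the environment."""
    type_, delimiter = REGISTRY.get(name, (str, None))
    if type_ is list:
        return delimiter.join(value)
    if type_ is bool:
        return "1" if value else "0"
    return str(value)


register("KICKASS_IMPORT_PATH", list, delimiter=";")
register("TIMEOUT", int)

paths = decode("KICKASS_IMPORT_PATH", "/usr/local/kickass")
paths.insert(0, "/home/tav/kickass")  # like .prepend(...)
encoded = encode("KICKASS_IMPORT_PATH", paths)
timeout = decode("TIMEOUT", None) or 60  # falls back to the default
print(encoded, timeout)
```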
By default, changes to environment variables are applied globally and inherited
by all sub-processes. To limit a value to a specific lexical scope, the with
statement can be used, e.g.
with {
$TIMEOUT = 120
} {
// Execute external commands.
}
System & Process Info
Common system and process-related info can also be found in some $
-prefixed
variables:
-
$arch
- The CPU architecture, e.g. arm64, x64, etc.
-
$args
- List of command-line arguments without the binary and script names.
-
$argv
- List of command-line arguments including the binary and script names.
-
$available_memory
- Currently available memory in bytes.
-
$boot_time
- Timestamp of when the system was last booted.
-
$cloud_info
- Access cloud metadata (on supported platforms).
-
$container_info
- Access container metadata and runtime info (on supported platforms).
-
$cpu_count
- Number of available CPU cores/threads.
-
$cpu_info
- Details on the system CPUs.
-
$cwd
- The current working directory.
-
$debug_build
- Whether this is a debug or release build.
-
$disk_info
- Details on the system disks.
-
$effective_gid
- The effective group ID (on supported platforms).
-
$effective_uid
- The effective user ID (on supported platforms).
-
$env
- Access environment variables.
-
$exit_code
- Exit code of the last executed command/process.
-
$groups
- List of group IDs that the user belongs to (on supported platforms).
-
$gpu_info
- Details on the system GPUs.
-
$interactive
- Boolean indicating whether the process is running within an interactive session.
-
$iroh_mode
- The current Iroh execution mode.
-
$iroh_version
- Version of the Iroh runtime/compiler.
-
$locale
- The current locale setting.
-
$home
- The current user’s home directory path.
-
$hostname
- The hostname of the system.
-
$machine_id
- Persistent identifier for the machine.
-
$max_memory
- Memory limit for the process.
-
$max_open_files
- File descriptor limit for the process.
-
$max_processes
- Processes limit for the process.
-
$mem_info
- Details on the system memory.
-
$network_info
- Details on the system network interfaces.
-
$os
- The current operating system, e.g. linux, macos, windows, etc.
-
$page_size
- The memory page size of the underlying system.
-
$parent_pid
- The parent process ID.
-
$pid
- The process ID of the current process.
-
$process_limits
- Details on any limits that apply to the current process.
-
$process_start_time
- Timestamp of when the current process started.
-
$real_gid
- The real group ID of the current process (on supported platforms).
-
$real_uid
- The real user ID of the current process (on supported platforms).
-
$saved_gid
- The saved group ID for privilege restoration (on supported platforms).
-
$saved_uid
- The saved user ID for privilege restoration (on supported platforms).
-
$session_id
- The process session ID.
-
$stderr_tty
- Checks whether standard error is attached to a TTY.
-
$stdin_tty
- Checks whether standard input is attached to a TTY.
-
$stdout_tty
- Checks whether standard output is attached to a TTY.
-
$term_colors
- Number of colours supported by the terminal.
-
$term_height
- The current terminal height.
-
$term_info
- Details about the terminal capabilities and type.
-
$term_width
- The current terminal width.
-
$temp_dir
- The default root directory for temporary files.
-
$timezone
- The system timezone.
-
$total_memory
- Total memory in bytes.
-
$user
- The current user’s username.
-
$user_cache_dir
- The default root directory for user-specific cache data.
-
$user_config_dir
- The default root directory for user-specific config data.
-
$virtualization_info
- Details about the virtualization environment (on supported platforms).
Shell Interaction
Iroh provides programmatic access to running external commands via the os/exec
package in the standard library. It also provides various forms of syntactic sugar to
make this easier.
An exec.Cmd
value, representing an external command to run, can be constructed
by prefixing a slice of strings with $
, e.g.
cmd = $["git", "checkout", commit]
type(cmd) == exec.Cmd // true
The returned value can be configured as needed, e.g. to set custom environment variables, use a custom reader for the command’s stdin, and a custom writer as its stderr:
cmd = $["ffmpeg", "-i", "-", "-c:v", "libx264", "-f", "mp4", "-"]
cmd.env = {
"AV_LOG_LEVEL": "verbose"
}
cmd.stdin = mkv_file
cmd.stderr = err_buf
To inherit the current environment variable values, $env.with
can be used,
e.g.
cmd.env = $env.with {
"AV_LOG_LEVEL": "verbose"
}
Methods on exec.Cmd
values allow for fine-grained control over command
execution, e.g.
-
cmd.output
— start the command, wait for it to finish, and return the contents of its standard output.
-
cmd.run
— start the command and wait for it to finish.
-
cmd.start
— start the command without waiting for it to finish.
A started command has additional methods, e.g.
-
cmd.memory_usage
— details about the memory used by the process.
-
cmd.pid
— the process identifier.
-
cmd.wait
— wait for the started command to finish.
Iroh provides shell-like syntax for running commands, piping output, etc. These can be executed by quoting the command within backticks, e.g.
commit = `git rev-parse HEAD`
This starts the given command, waits for it to finish, and if successful, i.e.
gets a 0
exit code from the sub-process, returns the standard output after
trimming.
Like most shells, whitespace is treated as a separator between arguments, and normally needs to be escaped, e.g.
output = `/home/tav/test\ code/chaos-test --threads 4`
The \
escape of the whitespace in the command above makes it equivalent to:
cmd = $["/home/tav/test code/chaos-test", "--threads", "4"]
output = cmd.output()
The usual '
and "
quote marks can be used to avoid the need for escaped
whitespace, e.g.
output = `chaos-test --select "byzantine node"`
String interpolation can be used within backtick commands, e.g.
output = `git checkout ${commit}`
Values being interpolated are escaped automatically. A slice of strings is treated as multiple space-separated arguments; any other value is treated as a single string argument.
This helps to prevent a range of security vulnerabilities, e.g.
dangerous_input = "'; rm -rf /; echo '"
output = `echo ${dangerous_input}` // Safely escaped.
While still allowing for multiple arguments to be passed in safely, e.g.
files = ["file 1.md", "file 2.md"]
output = `cat ${files}` // Becomes: cat "file 1.md" "file 2.md"
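The safety comes from building an argument list rather than a shell string. The Python sketch below shows the idea using the standard `shlex` module, under a simplifying assumption: interpolated values are only appended after the literal command text. The `build_argv` helper is hypothetical.

```python
import shlex


def build_argv(command, *values):
    """Split the literal command text, then append interpolated values."""
    argv = shlex.split(command)  # shell-like splitting of the literal part
    for value in values:
        if isinstance(value, (list, tuple)):
            argv.extend(str(v) for v in value)  # slice: one arg per element
        else:
            argv.append(str(value))  # anything else: a single argument
    return argv


dangerous_input = "'; rm -rf /; echo '"
echo_argv = build_argv("echo", dangerous_input)
cat_argv = build_argv("cat", ["file 1.md", "file 2.md"])
escaped_argv = build_argv(r"/home/tav/test\ code/chaos-test --threads 4")

print(echo_argv)
print(cat_argv)
print(escaped_argv)
print(shlex.join(cat_argv))  # a quoted, shell-safe rendering of the argv
```

Since the dangerous input stays a single list element and no shell ever parses it, the embedded `rm -rf /` is just text passed to `echo`.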
Commands can be pipelined, e.g.
output = `cat file.txt | grep "Error"`
Outputs can also be redirected, e.g. to write to a file:
`cat file1.txt file2.txt > new.txt`
Or to append to a file:
`cat file1.txt file2.txt >> new.txt`
By default, only the standard output is redirected. As most shells, like Bash, use complex syntax for controlling what gets redirected or piped, e.g.
ls /nonexistent 2>&1 1>/dev/null | grep "No such"
Iroh uses more explicit @keywords
in front of the |
pipe, or >
and
>>
redirect operators, e.g.
`ls /nonexistent @stderr | grep "No such"`
This takes the following values for piping or redirecting:
-
@all
— all streams, including standard output and error.
-
@both
— both the standard output and error.
-
@stderr
— just the standard error.
-
@stdout
— just the standard output (default behaviour).
-
@stream:N
— a specific file descriptor, e.g. @stream:3.
Iroh doesn’t support input redirection or heredocs within backtick commands as we believe that linear pipelines are easier to understand, i.e. when they go left to right.
Instead, when data needs to be piped in, the output of cat
can be used, or any
suitably typed value, i.e. a string
, []byte
, or io.Reader
, can be piped
into a backtick command using |
, e.g.
mp4_file = mkv_file | `ffmpeg -i - -c:v libx264 -f mp4 -`
This acts as syntactic sugar for:
cmd = $["ffmpeg", "-i", "-", "-c:v", "libx264", "-f", "mp4", "-"]
cmd.stdin = mkv_file
mp4_file = cmd.output()
Likewise, output can be redirected to an io.Writer
, e.g.
`git rev-parse HEAD` > file("commit.txt")
// Or even appended:
`git rev-parse HEAD` >> file("commits.txt")
Conditional execution within backtick commands can be controlled using and
and
or
, e.g.
`command1 and command2` // Only run command2 if command1 succeeds.
`command1 or command2` // Only run command2 if command1 fails.
Iroh supports automatic globbing when wildcard patterns are specified, e.g.
output = `cat *.log`
The following syntax is supported for globbing:
-
*
— matches any string, including empty.
-
?
— matches any single character.
-
[abc]
— matches any one character in the set.
-
[a-z]
— matches any one character in the range.
-
{foo,bar}
— matches alternates, i.e. either foo or bar.
-
**
— matches directories recursively.
For example:
Pattern | Example Matches |
---|---|
*.md | iroh.md, _doc.md |
file?.md | file1.md, fileA.md |
[abc]*.py | a.py, car.py |
[a-z]*.py | a.py, car.py, test.py |
{foo,bar}.sh | foo.sh, bar.sh |
**/*.ts | lib/main.ts, tests/fmt_test.ts |
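Most of these rows can be checked against Python's standard `fnmatch` module, which follows similar shell-style rules. The sketch below adds a small, hypothetical `expand_braces` helper, since `fnmatch` has no brace support, and leaves out `**` because `fnmatch` has no notion of recursive directory matching. It shows the pattern semantics only and is not Iroh's matcher.

```python
from fnmatch import fnmatchcase


def expand_braces(pattern):
    """Expand one {a,b} group into separate patterns (no nesting)."""
    start, end = pattern.find("{"), pattern.find("}")
    if start == -1 or end < start:
        return [pattern]
    head, tail = pattern[:start], pattern[end + 1:]
    return [head + alt + tail for alt in pattern[start + 1:end].split(",")]


def matches(pattern, name):
    return any(fnmatchcase(name, p) for p in expand_braces(pattern))


table = {
    "*.md": ["iroh.md", "_doc.md"],
    "file?.md": ["file1.md", "fileA.md"],
    "[abc]*.py": ["a.py", "car.py"],
    "[a-z]*.py": ["a.py", "car.py", "test.py"],
    "{foo,bar}.sh": ["foo.sh", "bar.sh"],
}
for pattern, names in table.items():
    print(pattern, all(matches(pattern, n) for n in names))
```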
When the '
single quote is used within backtick commands, globbing is not
applied, e.g.
output = `cat log | grep '*error'`
In the interests of safety and predictability, globbing is also not applied to any interpolated values, e.g.
filename = "*.log"
output = `cat ${filename}`
If explicit globbing is desired, then the built-in glob
function can be used,
e.g.
files = glob("*.log")
output = `cat ${files}`
This can also be used for quickly iterating over matching patterns, e.g.
for file in glob("*.md") {
// Do something with each of the Markdown files.
}
Iroh supports command substitution in arguments, e.g.
output = `echo ${`date`}`
But since the interpolated value is treated as a single argument, something like the following won’t work:
output = `grep "ERROR" ${`find . -name "*.log"`}`
The inner command output will need to be turned into a slice of strings first, e.g.
output = `grep "ERROR" ${`find . -name "*.log"`.split("\n")}`
Or, even better:
files = `find . -name "*.log"`.split("\n")
output = `grep "ERROR" ${files}`
We believe this makes code more readable. It also makes life safer than in
shells like Bash where inputs can cause unexpected outcomes depending on the
IFS
value and how it’s quoted.
Errors generated when running commands, e.g. when they return a non-zero exit
code, can be handled by using the try
keyword, e.g.
output = try `ls /nonexistent`
If, instead of generating errors, an explicit response object is preferred, then
the backtick command can be prefixed with a $
. This returns a value with the
exit_code
, stdout
, stderr
, etc.
response = $`ls /nonexistent`
response.exit_code != 0 // true
By default, all backtick commands are waited on to finish running. The &
operator can be used after a backtick command to return a background job
instead, e.g.
job = `sleep 10` &
Backgrounded processes can be signalled with platform-supported signals like
.sigterm
, e.g.
if job.is_running() {
job.signal(.sigkill)
}
If a command needs to be run so that the user can directly type in any input,
and see the output as it happens, then the $()
form can be used:
$(ls -al)
Besides calling external processes, Iroh also supports running local commands defined within Iroh. These need to satisfy the interface:
localcmd = interface {
__init__((stdin: io.Reader, stdout: io.Writer, stderr: io.Writer))
__call__(args ...string) exit_code
}
They can be registered with a specific name, e.g.
localcmd.register("chaos-test", ChaosTest)
And can then be used like any external command, e.g.
output = `chaos-test --threads 4`
Certain built-in commands like cd
are implemented like this, and can thus also
be called as plain functions, e.g.
cd("silo/espra")
commit = `git rev-parse HEAD`
As changing working directories is a common need in shell scripting, the with
statement can be used to change the working directory for the lexical scope,
e.g.
with {
.cwd = "silo/espra"
} {
// Do things in the silo/espra sub-directory here.
}
Finally, in script
mode, if interactive shell support is enabled, Iroh allows
for shell commands to be run without needing to be encapsulated within $()
,
e.g.
cd silo/espra
git rev-parse HEAD > commit.txt
This works by:
-
First, trying to interpret a line as if it were non-shell code.
-
Otherwise, it tries to treat the line as if it were encapsulated within $().
-
If neither succeeds, an error is generated.
For example:
cd silo/espra
commit = `cat commit.txt`
if commit.starts_with("abcdef") {
// Celebrate!
}
This allows the “normal” programming aspects of the Iroh language to be seamlessly interwoven with shell code within scripts, the Iroh REPL/Shell, etc.
C Interoperability
Iroh aims to match the high bar set by Zig for C interoperability with zero overhead. Like Zig, we ship with a C compiler and linker so that C code can be imported and used just like Iroh packages, e.g.
import "github.com/micrypt/glfw-iroh" as c
if c.glfwInit() == 0 {
...
}
window = c.glfwCreateWindow(800, 600, "Hello GLFW from Iroh", nil, nil)
if window == nil {
...
}
c.glfwMakeContextCurrent(window)
for c.glfwWindowShouldClose(window) == 0 {
...
}
While complex macros don’t get translated, constants using #define
get
imported automatically, e.g.
import "github.com/tav/limits" as c
// C: #define INT_MAX 2147483647
max_int = c.INT_MAX
Iroh supports a number of data types that match whatever the C compiler would produce for a target platform:
Iroh | C Type | Typical Size |
---|---|---|
c_char | char | 8 bits |
⤷ Platform-dependent signedness | ||
c_schar | signed char | 8 bits |
c_uchar | unsigned char | 8 bits |
c_short | short int | 16 bits |
c_ushort | unsigned short int | 16 bits |
c_int | int | 32 bits |
c_uint | unsigned int | 32 bits |
c_long | long int | 32/64 bits |
⤷ 32-bit on Windows, 64-bit on 64-bit Unix | ||
c_ulong | unsigned long int | 32/64 bits |
⤷ 32-bit on Windows, 64-bit on 64-bit Unix | ||
c_longlong | long long int | 64 bits |
c_ulonglong | unsigned long long int | 64 bits |
c_size_t | size_t | Pointer size |
⤷ Same as uint typically | ||
c_ssize_t | ssize_t | Pointer size |
⤷ Same as int typically | ||
c_ptrdiff_t | ptrdiff_t | Pointer size |
⤷ For pointer arithmetic | ||
c_float | float | 32 bits |
c_double | double | 64 bits |
c_longdouble | long double | 64/80/128 bits |
⤷ Platform-dependent precision | ||
c_string | char* | - |
⤷ Alternatively: [*]c_char | ||
!c_string | const char* | - |
⤷ Alternatively: ![*]c_char | ||
c_union | union | - |
c_void | void | - |
opaque | void* | - |
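The "typical" sizes can be checked on any given machine. As an illustration, the Python snippet below queries the local C ABI through Python's standard `ctypes` module. Its type names happen to resemble Iroh's, but these are Python's types, not Iroh's.

```python
import ctypes

C_TYPES = [
    "c_char", "c_short", "c_int", "c_long", "c_longlong",
    "c_size_t", "c_ssize_t", "c_float", "c_double", "c_void_p",
]

# Size of each C type in bits, as seen by the local platform's ABI.
sizes = {name: ctypes.sizeof(getattr(ctypes, name)) * 8 for name in C_TYPES}
for name, bits in sizes.items():
    print(f"{name:12} {bits} bits")
```

Running this on different targets shows the platform-dependent rows, such as `c_long`, changing while the fixed-width rows stay the same.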
C function signatures can be specified with an extern
and called directly from
Iroh code, e.g.
extern func process_data(buf [*]c_char, len c_size_t, scale c_double) c_int
result = process_data(buf, len(buf), 1.23)
Variadic functions can be called as expected, e.g.
extern func printf(format c_string, ...) c_int
printf("Hello %s, number: %d\n", "world", 42)
To maintain ABI compatibility with C without any overhead, C code can only
be called from within the .single_threaded
and .multi_threaded
schedulers
with non-gc allocators.
In those instances, there is no marshalling overhead and the native C calling convention is followed without any runtime interference, e.g.
// This compiles to identical assembly as C.
func add(a c_int, b c_int) c_int {
return a + b
}
To support callbacks from C, the type signature of Iroh functions can specify that they use the C calling convention, e.g.
extern func register_callback(cb func(c_int) callconv(.c) c_void) c_void
func my_callback(x c_int) callconv(.c) c_void {
// Handle callback value.
}
register_callback(my_callback)
To match C’s memory layout for structs, the extern struct
keyword needs to be
used, e.g.
Point = extern struct {
x c_int
y c_int
}
The c_union
data type matches unions in C, e.g.
Value = c_union {
i c_int
f c_float
bytes [4]c_char
}
These are untagged and you must know which field is active. Accessing an inactive field can result in garbage data or even trigger a hardware trap, e.g.
Value = c_union {
i c_int
f c_float
bytes [4]c_char
}
v = Value{i: 42} // The .i field is active
x = v.i // This is OK
y = v.f // This is not OK
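The "garbage data" case is easy to reproduce with any untagged union. The sketch below uses Python's standard `ctypes.Union` to mirror the same layout. It shows the general C behaviour rather than anything specific to Iroh: reading the inactive float field simply reinterprets the integer's bytes.

```python
import ctypes
import struct


class Value(ctypes.Union):
    _fields_ = [
        ("i", ctypes.c_int),
        ("f", ctypes.c_float),
        ("bytes", ctypes.c_char * 4),
    ]


v = Value()
v.i = 42         # the .i field is now the active one
as_float = v.f   # reads the same four bytes as if they were a float

# Reinterpreting the bytes of the integer 42 by hand gives the same result.
expected = struct.unpack("=f", struct.pack("=i", 42))[0]
print(ctypes.sizeof(Value), as_float == expected, as_float)
```

The value read through the float field is a tiny denormal number that has nothing to do with 42, which is why the active field must be tracked by the caller.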
While the compiler can detect certain accesses as unsafe and generate edit-time errors, this may not always be possible, e.g. when calling external libraries. Such accesses will thus be marked as unsafe.
Alignment can be forced if needed with align
, e.g.
Point = extern struct {
x c_int align(16)
y c_int
}
Exact bit control can be done using packed
, e.g.
ColorWriteMaskFlags = extern struct(uint32, packed) {
red: false,
green: false,
blue: false,
alpha: false,
}
The mem.c_allocator
makes the C allocator available from Iroh, e.g.
with {
ctx.allocator = mem.c_allocator()
} {
...
}
As calling C code is inherently unsafe, Iroh makes limited safety guarantees when C code is called:
-
If the C code being called was compiled by Iroh and is well-defined and does not cause any undefined behaviour according to the C23 standard, it is marked as safe.
-
Otherwise, the call to C is marked as unsafe. Such unsafe code will not be allowed in contexts like
onchain-script
mode, and will need to be explicitly approved otherwise.
Relatedly, as the [*]
C-style pointers can be null pointers, they explicitly
need to be checked for nil
before potentially unsafe operations like accessing
members, indexing, etc.
Finally, like Zig, Iroh also ships with multiple sets of libc headers that allow for easy cross-compilation for various target platforms.
Dynamically Generating Assembly
Compilers are not perfect. There will always be edge cases where a developer will be able to get better performance from a machine by writing raw assembly themselves.
Iroh provides a genasm
keyword and an associated asm
package that provides
rich support for generating assembly code programmatically. These can be used
inside function bodies, e.g.
import * from "asm"
func popcount(v uint32) uint32 {
result = uint32(0)
genasm {
XORL(result, result)
loop:
TESTL(v, v)
JZ(:done)
SHRL(v, 1)
ADCL(result, 0)
JMP(:loop)
done:
}
return result
}
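For readers less familiar with x86 flags, the loop above can be modelled in Python. This is a reference sketch for checking the logic rather than generated code: each shift moves the lowest bit into the carry flag, and the add-with-carry accumulates it.

```python
def popcount(v):
    """Count set bits in a 32-bit value using shift-and-add-carry."""
    v &= 0xFFFFFFFF
    result = 0           # XORL result, result
    while v != 0:        # TESTL v, v / JZ done
        carry = v & 1    # SHRL moves the lowest bit into the carry flag
        v >>= 1
        result += carry  # ADCL with 0 adds just the carry
    return result


print(popcount(0b1011), popcount(0xFFFFFFFF))
```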
When genasm
blocks are at the top level of a package, they are used to define
functions in assembly code and need to specify the complete function signature.
While languages like C, C++, D, Rust, and Zig support inline assembly, Iroh lets
you generate assembly code at compile-time using standard control structures
such as if
conditions and for
loops.
This makes dealing with assembly code a lot easier, e.g. these 100 lines of code, ported from avo, generate the 1,500 lines of assembly code to support SHA-1 hashing:
import * from "asm"
genasm block(h *[5]uint32, m []byte) {
w = asm.stack_alloc(64)
w_val = (r) => w.offset((r % 16) * 4)
asm.comment("Load initial hash.")
hash = [GP32(), GP32(), GP32(), GP32(), GP32()]
for i, r in hash {
MOVL(h.offset(4*i), r)
}
asm.comment("Initialize registers.")
a, b, c, d, e = GP32(), GP32(), GP32(), GP32(), GP32()
for i, r in [a, b, c, d, e] {
MOVL(hash[i], r)
}
steps = [
(f: choose, k: 0x5a827999),
(xor, 0x6ed9eba1),
(majority, 0x8f1bbcdc),
(xor, 0xca62c1d6),
]
for r in 0..79 {
asm.comment("Round ${r}.")
q = steps[r/20]
// Load message value.
u = GP32()
if r < 16 {
MOVL(m.offset(4*r), u)
BSWAPL(u)
} else {
MOVL(w_val(r-3), u)
XORL(w_val(r-8), u)
XORL(w_val(r-14), u)
XORL(w_val(r-16), u)
ROLL(U8(1), u)
}
MOVL(u, w_val(r))
// Compute the next state register.
t = GP32()
MOVL(a, t)
ROLL(U8(5), t)
ADDL(q.f(b, c, d), t)
ADDL(e, t)
ADDL(U32(q.k), t)
ADDL(u, t)
// Update registers.
ROLL(Imm(30), b)
a, b, c, d, e = t, a, b, c, d
}
asm.comment("Final add.")
for i, r in [a, b, c, d, e] {
ADDL(r, hash[i])
}
asm.comment("Store results back.")
for i, r in hash {
MOVL(r, h.offset(4*i))
}
RET()
}
func choose(b, c, d Register) Register {
r = GP32()
MOVL(d, r)
XORL(c, r)
ANDL(b, r)
XORL(d, r)
return r
}
func majority(b, c, d Register) Register {
t, r = GP32(), GP32()
MOVL(b, t)
ORL(c, t)
ANDL(d, t)
MOVL(b, r)
ANDL(c, r)
ORL(t, r)
return r
}
func xor(b, c, d Register) Register {
r = GP32()
MOVL(b, r)
XORL(c, r)
XORL(d, r)
return r
}
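The choose and majority helpers use slightly shorter instruction sequences than the textbook definitions of the SHA-1 round functions. The Python sketch below mirrors each sequence with bitwise operators and compares it against the standard definition. The reference functions and the sample values are our own additions for illustration.

```python
from itertools import product

MASK = 0xFFFFFFFF


def choose(b, c, d):
    # Mirrors the generated sequence: MOVL d, XORL c, ANDL b, XORL d.
    return (((d ^ c) & b) ^ d) & MASK


def majority(b, c, d):
    # Mirrors the generated sequence: (b | c) & d, then OR with (b & c).
    return (((b | c) & d) | (b & c)) & MASK


def choose_reference(b, c, d):
    # Standard definition: pick bits from c where b is set, else from d.
    return ((b & c) | (~b & d)) & MASK


def majority_reference(b, c, d):
    # Standard definition: a bit is set if it is set in at least two inputs.
    return ((b & c) | (b & d) | (c & d)) & MASK


def agrees_with_reference():
    samples = [0, MASK, 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    return all(
        choose(b, c, d) == choose_reference(b, c, d)
        and majority(b, c, d) == majority_reference(b, c, d)
        for b, c, d in product(samples, repeat=3)
    )
```

Since every operation is bitwise, agreement on each single bit implies agreement on full 32-bit words.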
As assembly-generating code within Iroh can be packaged up and reused, we expect
genasm
to be more heavily used than inline assembly in other languages.
In particular, we expect to see genasm
used in performance-critical code, e.g.
to take advantage of specific SIMD instructions when the compiler can’t
automatically vectorize code.
The asm
support package takes inspiration from projects like PeachPy, AsmJit,
and avo to enable programmatic assembly code generation, and takes care of some
complex aspects, e.g.
-
Supporting unlimited virtual registers which are transparently mapped to physical registers.
-
Automatically taking care of correct memory offsets for complex data structures.
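To give a feel for the first point, here is a minimal Python sketch of virtual registers being mapped onto a small set of physical registers. The VReg, Builder, and allocate names are hypothetical and are not the API of the asm package. The sketch also assumes straight-line code and does no spilling, which a real allocator would need to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VReg:
    """A virtual register; any number of these can be created."""
    id: int


class Builder:
    def __init__(self):
        self.instructions = []
        self._next_id = 0

    def gp32(self):
        reg = VReg(self._next_id)
        self._next_id += 1
        return reg

    def emit(self, op, *operands):
        self.instructions.append((op, operands))


def allocate(instructions, physical):
    """Rewrite virtual registers to physical ones, reusing freed registers."""
    # Record the last instruction index at which each virtual register is used.
    last_use = {}
    for index, (_, operands) in enumerate(instructions):
        for operand in operands:
            if isinstance(operand, VReg):
                last_use[operand] = index

    free = list(physical)
    mapping = {}
    lowered = []
    for index, (op, operands) in enumerate(instructions):
        virtual = sorted({o for o in operands if isinstance(o, VReg)},
                         key=lambda reg: reg.id)
        for reg in virtual:
            if reg not in mapping:
                if not free:
                    raise RuntimeError("out of registers; spilling is not modelled")
                mapping[reg] = free.pop(0)
        lowered.append((op, tuple(mapping.get(o, o) for o in operands)))
        # Release registers whose virtual register is never used again.
        for reg in virtual:
            if last_use[reg] == index:
                free.append(mapping[reg])
    return lowered


builder = Builder()
for i in range(2):  # an ordinary loop drives the code generation
    tmp = builder.gp32()
    builder.emit("MOVL", f"{4 * i}(DI)", tmp)
    builder.emit("MOVL", tmp, f"{4 * i}(SI)")
lowered = allocate(builder.instructions, ["AX"])
```

Although the loop creates two virtual registers, their lifetimes do not overlap, so both end up in the single physical register that was made available.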
Initially, this package will support currently popular architectures:
-
x64
, i.e. x86-64, the 64-bit version of Intel’s x86 architecture as developed by AMD.
-
arm64
, i.e. AArch64, the 64-bit version of the ARM architecture.
Assembly-generating code can match on $arch
to generate different assembly
code for different architectures:
match $arch {
.x64:
// x64-specific assembly code
.arm64:
// arm64-specific assembly code
default:
// everything else
}
Support will be added over time for instructions in newer versions of architectures, as well as other architectures as they gain adoption within the broader market.
To LLVM or Not
For many new languages, LLVM has been the go-to choice for compiler infrastructure. Rust. Julia. Swift. They all use LLVM to generate and optimize machine code.
This is with good reason. LLVM is hard to match. It has battle-tested code generation for multiple architectures, with decades of work on code optimization!
But, while LLVM is definitely an amazing piece of engineering, we will not be using it to build Iroh’s official compiler:
-
LLVM is slow moving. For almost a decade, the Rust team had to maintain and ship their own fork of LLVM as the official releases didn’t include the features and bug fixes that they needed.
-
It imposes a lot of cost on language designers, e.g. working around the constraints imposed by LLVM, working around the regressions introduced by new versions, etc.
-
LLVM was never built with developer productivity in mind. If you want your language to have features like fast compilation, LLVM is painful to work around.
Instead, we will follow in the footsteps of Go, and more recently Zig, and handle the generation and optimization of machine code ourselves:
-
For starters, to cover most use cases, we only need to support 6 platforms:
android-arm64
,ios-arm64
,linux-arm64
,linux-x64
,macos-arm64
,windows-x64
.
-
While hardware speeds double roughly every 18 months, progress in compiler optimizations is much slower: you only get speed doublings every couple of decades (Proebsting’s Law estimates roughly 18 years).
-
After all, Frances Allen catalogued most of the high-impact optimizations way back in 1971: inlining, unrolling, CSE, code motion, constant folding, DCE, and peephole optimization.
-
As evidenced by Go, even a simple compiler can produce reasonably fast code. Besides improving developer productivity, a simpler compiler is also likely to produce fewer miscompilations.
-
Most code in an executable is only run a few times. By giving our users the ability to dynamically generate assembly, we let them optimize hotspots better than most compilers could.
So, while it’ll be a challenge to build a decent code generator, we believe the payoff will be well worth it.