8 changes: 7 additions & 1 deletion docs/rvm/architecture.md
@@ -254,7 +254,13 @@ include formatted state snapshots where possible.
7. **Host await**: In run-to-completion mode, `HostAwait` consumes a response
from `host_await_responses`. Suspendable mode yields control with a
`SuspendReason::HostAwait { dest, argument, identifier }` that the host must
service.
service. The compiler supports two ways to emit `HostAwait`:
- **Explicit**: `__builtin_host_await(payload, identifier)` — raw 2-argument
form.
- **Registered**: `compile_from_policy_with_host_await` accepts a list of
`(name, arg_count)` pairs. Calls to registered names are compiled as
`HostAwait` with the function name as the identifier literal. Registered
names take precedence over user-defined functions and standard builtins.
8. **Completion**: `Return` wraps the selected register value into
`InstructionOutcome::Return`, unwinding frames until the entry frame is
cleared. `RuleReturn` is a specialised variant used by rule execution
44 changes: 44 additions & 0 deletions docs/rvm/instruction-set.md
@@ -177,6 +177,50 @@ Parameter tables:
- Suspendable: emits `InstructionOutcome::Suspend` with `SuspendReason::HostAwait`.
The host must resume with a value that will be written into `dest`.

### Registered host-await builtins

The compiler can be configured with a list of function names that map directly
to `HostAwait` instructions. This allows policy authors to write natural
function calls (e.g. `lookup(input.account_id)`) instead of the raw
`__builtin_host_await(payload, identifier)` builtin.

Registration is done at compile time via `Compiler::compile_from_policy_with_host_await`:

```rust
let builtins = [("lookup", 1), ("persist", 1)];
let program = Compiler::compile_from_policy_with_host_await(
&compiled_policy, &entry_points, &builtins,
)?;
```

Each registration entry is a `(name, arg_count)` pair. When the compiler
encounters a call to a registered name, it emits a `HostAwait` instruction
with:
- `arg` = the register holding the call's single argument
- `id` = a register loaded with a string literal containing the function name

Both the explicit `__builtin_host_await(arg, id)` call and a registered
builtin call produce the **same `HostAwait` bytecode instruction**. The only
difference is how the `id` register is populated: explicit calls take it from
the second user-supplied argument, while registered calls auto-generate a
`Load` instruction for the function name string. The VM cannot distinguish
between the two at runtime.
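
To make the equivalence concrete, here is a minimal sketch of the two lowering paths. The `Instruction` enum, the `Emitter` struct, and the register and literal index types are simplified stand-ins invented for illustration; they are not the actual RVM definitions.

```rust
/// Toy stand-in for the RVM instruction set: only the two variants needed
/// here, with simplified field types (assumptions, not the real layout).
#[derive(Debug, Clone, PartialEq)]
enum Instruction {
    Load { dest: u8, literal_idx: u16 },
    HostAwait { dest: u8, arg: u8, id: u8 },
}

/// Hypothetical emitter holding a literal table, the emitted code, and a
/// bump-style register allocator.
#[derive(Default)]
struct Emitter {
    literals: Vec<String>,
    code: Vec<Instruction>,
    next_reg: u8,
}

impl Emitter {
    fn alloc_register(&mut self) -> u8 {
        let reg = self.next_reg;
        self.next_reg += 1;
        reg
    }

    fn add_literal(&mut self, text: &str) -> u16 {
        self.literals.push(text.to_string());
        (self.literals.len() - 1) as u16
    }

    /// Explicit form: the policy author supplied both `arg` and `id`
    /// registers, so a single `HostAwait` is emitted.
    fn emit_explicit(&mut self, dest: u8, arg: u8, id: u8) {
        self.code.push(Instruction::HostAwait { dest, arg, id });
    }

    /// Registered form: the `id` register is synthesized by loading the
    /// function name as a string literal, then the same `HostAwait` follows.
    fn emit_registered(&mut self, dest: u8, arg: u8, name: &str) {
        let id = self.alloc_register();
        let literal_idx = self.add_literal(name);
        self.code.push(Instruction::Load { dest: id, literal_idx });
        self.code.push(Instruction::HostAwait { dest, arg, id });
    }
}
```

In this sketch the only observable difference between the two paths is the extra `Load` that precedes the `HostAwait`; the `HostAwait` variant itself has the same shape in both cases.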

**Resolution order** in `determine_call_target()`:
1. `__builtin_host_await` (magic 2-argument form)
2. Registered host-await builtins (matched by bare function name)
3. User-defined functions (matched by package-qualified path)
4. Standard builtins (matched by bare function name)

Registered names shadow both user-defined functions and standard builtins.
For example, registering `time.parse_duration_ns` routes every call to it
through the host instead of the built-in Rust implementation.
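
The resolution order can be sketched as a chain of early returns. The `Resolver` and `Target` types below are hypothetical and much simpler than the compiler's own `CallTarget`; they only model which lookup wins, assuming the caller already has both the bare name and the package-qualified path of the call.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified result of resolving a call expression.
#[derive(Debug, PartialEq)]
enum Target {
    ExplicitHostAwait,
    RegisteredHostAwait(String),
    User(String),
    Builtin(String),
}

/// Hypothetical lookup tables standing in for compiler state.
struct Resolver {
    host_await: BTreeMap<String, usize>, // registered name -> arg count
    user_fns: BTreeSet<String>,          // package-qualified paths
    builtins: BTreeSet<String>,          // bare builtin names
}

impl Resolver {
    /// Checks the four sources in the documented order; the first hit wins.
    fn resolve(&self, bare: &str, qualified: &str) -> Option<Target> {
        if bare == "__builtin_host_await" {
            return Some(Target::ExplicitHostAwait);
        }
        if self.host_await.contains_key(bare) {
            return Some(Target::RegisteredHostAwait(bare.to_string()));
        }
        if self.user_fns.contains(qualified) {
            return Some(Target::User(qualified.to_string()));
        }
        if self.builtins.contains(bare) {
            return Some(Target::Builtin(bare.to_string()));
        }
        None
    }
}
```

Because the registered-name check runs before the user-function and builtin checks, a name present in several tables always resolves to the host-await target in this model.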

**Argument handling**: The `HostAwait` instruction carries a single `arg`
register. Registered builtins must use `arg_count: 1`; the compiler rejects
any other value (including `0`) at registration time. To pass multiple values,
pack them into an object: `lookup({"user": x, "resource": y})`.
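
On the host side, object packing means the handler receives one value and unpacks the fields it needs. The following is a rough sketch under stated assumptions: the `Value` enum is a toy stand-in (not the crate's `Value` type), `service_host_await` is a hypothetical handler name, and the access rule inside it is made up purely so the example has something to compute.

```rust
use std::collections::BTreeMap;

/// Toy value type for illustration only; not the crate's `Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Bool(bool),
    String(String),
    Object(BTreeMap<String, Value>),
}

/// Hypothetical host handler: dispatches on the identifier carried by the
/// suspension and unpacks the single object argument.
fn service_host_await(identifier: &str, argument: &Value) -> Result<Value, String> {
    match identifier {
        "lookup" => {
            // The policy called lookup({"user": x, "resource": y}).
            let Value::Object(fields) = argument else {
                return Err("lookup expects an object payload".to_string());
            };
            match (fields.get("user"), fields.get("resource")) {
                (Some(Value::String(user)), Some(Value::String(resource))) => {
                    // Made-up rule standing in for a real backend query.
                    Ok(Value::Bool(user == "alice" && resource == "report"))
                }
                _ => Err("lookup payload needs string 'user' and 'resource'".to_string()),
            }
        }
        other => Err(format!("unknown host-await identifier '{other}'")),
    }
}
```

The value returned by such a handler is what the host would hand back on resume, to be written into `dest`.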

---

## Halt instruction
78 changes: 69 additions & 9 deletions src/languages/rego/compiler/function_calls.rs
@@ -17,8 +17,20 @@ use crate::lexer::Span;
use crate::rvm::instructions::{BuiltinCallParams, FunctionCallParams};
use crate::rvm::Instruction;
use crate::utils::get_path_string;
use alloc::{format, string::ToString, vec::Vec};
use crate::value::Value;
use alloc::{
format,
string::{String, ToString},
vec::Vec,
};

/// Resolved destination of a Rego function-call expression. Produced by
/// [`Compiler::determine_call_target`] and consumed by
/// [`Compiler::compile_function_call`] to choose which instruction to emit.
/// Carrying the discrimination in the type (rather than re-matching on a
/// magic name at the emit site) keeps the host-await handling honest under
/// future refactors — the compiler will refuse to build if a new variant is
/// added without updating every match site.
enum CallTarget {
User {
rule_index: u16,
@@ -28,9 +40,14 @@ enum CallTarget {
builtin_index: u16,
expected_args: Option<usize>,
},
HostAwait {
expected_args: Option<usize>,
},
/// Explicit `__builtin_host_await(arg, id)` call form (2 user args).
/// The identifier is supplied by the policy author at runtime via the
/// second argument register.
ExplicitHostAwait,
/// A registered host-awaitable builtin invoked by its registered name
/// (1 user arg). The identifier is the registered name itself and is
/// baked into the bytecode as a string literal at compile time.
RegisteredHostAwait { identifier: String },
}

impl<'a> Compiler<'a> {
@@ -59,7 +76,11 @@ impl<'a> Compiler<'a> {
let expected_args = match &call_target {
CallTarget::User { expected_args, .. } => *expected_args,
CallTarget::Builtin { expected_args, .. } => *expected_args,
CallTarget::HostAwait { expected_args } => *expected_args,
// Both host-await variants have a fixed arity that is implied by the
// variant itself, so the rest of the compiler can depend on the type
// rather than re-matching on the magic name `__builtin_host_await`.
CallTarget::ExplicitHostAwait => Some(2),
CallTarget::RegisteredHostAwait { .. } => Some(1),
};

if let Some(expected) = expected_args {
@@ -126,7 +147,8 @@ impl<'a> Compiler<'a> {
});
self.emit_instruction(Instruction::BuiltinCall { params_index }, &span);
}
CallTarget::HostAwait { .. } => {
CallTarget::ExplicitHostAwait => {
// Explicit __builtin_host_await(arg, id) — 2 arguments
if arg_regs.len() != 2 {
return Err(CompilerError::General {
message: format!(
@@ -136,7 +158,6 @@ impl<'a> Compiler<'a> {
}
.at(&span));
}

self.emit_instruction(
Instruction::HostAwait {
dest,
@@ -146,6 +167,37 @@ impl<'a> Compiler<'a> {
&span,
);
}
CallTarget::RegisteredHostAwait { identifier } => {
// Registered host-awaitable builtin — the identifier is the
// registered name and is baked into the bytecode as a literal.
if arg_regs.len() != 1 {
return Err(CompilerError::General {
message: format!(
"host-awaitable builtin '{}' expects exactly 1 argument, got {}",
identifier,
arg_regs.len()
),
}
.at(&span));
}
let id_reg = self.alloc_register();
let literal_idx = self.add_literal(Value::String(identifier.into()));
self.emit_instruction(
Instruction::Load {
dest: id_reg,
literal_idx,
},
&span,
);
self.emit_instruction(
Instruction::HostAwait {
dest,
arg: arg_regs[0],
id: id_reg,
},
&span,
);
}
}

if let Some((plan, plan_span)) = &out_param_plan {
@@ -187,8 +239,16 @@ impl<'a> Compiler<'a> {
span: &Span,
) -> Result<CallTarget> {
if original_fcn_path == "__builtin_host_await" {
return Ok(CallTarget::HostAwait {
expected_args: Some(2),
return Ok(CallTarget::ExplicitHostAwait);
}

// Check registered host-awaitable builtins. Registered builtins are
// restricted to arg_count == 1 at registration time (see
// `Compiler::register_host_await_builtin`), so the variant doesn't
// need to carry an arity — it's fixed at 1.
if self.host_await_builtins.contains_key(original_fcn_path) {
return Ok(CallTarget::RegisteredHostAwait {
identifier: original_fcn_path.to_string(),
});
}

59 changes: 59 additions & 0 deletions src/languages/rego/compiler/mod.rs
@@ -26,7 +26,9 @@ use crate::rvm::program::{Program, RuleType, SpanInfo};
use crate::value::Value;
use crate::CompiledPolicy;
use alloc::collections::{BTreeMap, BTreeSet};
use alloc::format;
use alloc::string::String;
use alloc::string::ToString as _;
use alloc::vec;
use alloc::vec::Vec;
use indexmap::IndexMap;
@@ -139,6 +141,10 @@ pub struct Compiler<'a> {
current_call_stack: Vec<u16>,
entry_points: IndexMap<String, usize>,
soft_assert_mode: bool,
/// Registered host-awaitable builtins: name → expected arg count.
/// When the compiler encounters a call to one of these names, it emits a
/// `HostAwait` instruction instead of a regular function or builtin call.
host_await_builtins: BTreeMap<String, usize>,
}

impl<'a> Compiler<'a> {
@@ -173,9 +179,62 @@ impl<'a> Compiler<'a> {
current_call_stack: Vec::new(),
entry_points: IndexMap::new(),
soft_assert_mode: false,
host_await_builtins: BTreeMap::new(),
}
}

/// Register a function name as a host-awaitable builtin.
///
/// When the compiler encounters a call to `name(arg)`, it will emit a
/// `HostAwait` instruction with the argument and `name` as the identifier,
/// instead of treating it as a user-defined or standard builtin function.
///
/// `arg_count` must be exactly 1. The `HostAwait` instruction carries a
/// single argument register; use object packing to pass multiple values
/// (e.g. `name({"key1": v1, "key2": v2})`).
///
/// Returns `Err` when:
/// - `name` is the reserved identifier `__builtin_host_await`,
/// - `name` is empty or only whitespace,
/// - `name` is already registered (duplicate registration is rejected
/// rather than silently overwritten),
/// - `arg_count` is not exactly 1.
pub fn register_host_await_builtin(&mut self, name: &str, arg_count: usize) -> Result<()> {
if name == "__builtin_host_await" {
return Err(CompilerError::General {
message: "__builtin_host_await is a reserved name and cannot be registered as a host-await builtin"
.to_string(),
}
.into());
}
if name.trim().is_empty() {
return Err(CompilerError::General {
message: "host-await builtin name must not be empty or whitespace".to_string(),
}
.into());
}
if self.host_await_builtins.contains_key(name) {
return Err(CompilerError::General {
message: format!(
"host-await builtin '{name}' is already registered; \
duplicate registration is not allowed"
),
}
.into());
}
if arg_count != 1 {
return Err(CompilerError::General {
message: format!(
"registered host-await builtin '{name}' must have arg_count == 1, got {arg_count}. \
Use object packing to pass multiple values."
),
}
.into());
}
self.host_await_builtins.insert(name.to_string(), arg_count);
Ok(())
}

pub(super) fn with_soft_assert_mode<F, R>(&mut self, enabled: bool, f: F) -> R
where
F: FnOnce(&mut Self) -> R,
12 changes: 12 additions & 0 deletions src/languages/rego/compiler/rules.rs
@@ -181,8 +181,20 @@ impl<'a> Compiler<'a> {
pub fn compile_from_policy(
policy: &CompiledPolicy,
entry_points: &[&str],
) -> Result<Arc<Program>> {
Self::compile_from_policy_with_host_await(policy, entry_points, &[])
}

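/// Compile a `CompiledPolicy` into an RVM `Program`, first registering each
/// `(name, arg_count)` pair in `host_await_builtins` as a host-awaitable
/// builtin. Returns an error if any registration is rejected.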
pub fn compile_from_policy_with_host_await(
policy: &CompiledPolicy,
entry_points: &[&str],
host_await_builtins: &[(&str, usize)],
) -> Result<Arc<Program>> {
let mut compiler = Compiler::with_policy(policy);
for &(name, arg_count) in host_await_builtins {
compiler.register_host_await_builtin(name, arg_count)?;
}
compiler.current_rule_path = "".to_string();
let rules = policy.get_rules();
