Introduction
Boomslang runs CPython 3.14 from a WASI build. The default artifact embeds that runtime in Java through Chicory, so Python runs inside the JVM without JNI, subprocesses, or a system Python install.
Python code executes in a fully sandboxed WebAssembly memory space: it sees only the filesystem you give it, calls only the host functions you register, and a misbehaving script cannot take down the JVM.
What’s in the box
The default Maven artifact ships with:
- CPython 3.14 built for wasm32-wasip1
- the Python stdlib plus NumPy, Pandas, Matplotlib, Pillow, Pydantic, ijson, and Jinja2
- python/bin/boomslang.wasm, the runtime module
- generated Chicory AOT classes, so the WASM runs as compiled JVM bytecode instead of being interpreted
- copy-on-write memory snapshots: the interpreter is pre-initialized at build time (Wizer), so creating a PythonInstance is a memory copy measured in milliseconds, not a full CPython startup
- boomslang_host, a small Python-side bridge for calling host functions
Supported hosts
A host is the outside process embedding boomslang.wasm: it supplies the WASM runtime and implements imported host functions. (See the glossary — “host” deliberately does not mean the Rust code inside the module.)
| Host language | Status | Runtime | Host adapter support |
|---|---|---|---|
| Java | Primary host | Chicory | Stock runtime API, HostBridge, generated Java adapters |
| Python | Supported host package | Wasmtime (wasmtime-py) | boomslang-py wheel with the Sandbox API and host functions |
| Rust | Supported example host | Wasmtime | Generated Rust adapters; see examples/rust-host/ |
| Other languages | ABI target | Any WASM runtime with compatible imports | Implement the ABI JSON contract directly |
Where to go next
- Quickstart: run Python from Java in about five minutes
- Installation & Runtime Variants: Maven coordinates and the no-python-runtime classifier
- Glossary: host vs. guest, and what the directory names actually mean
Quickstart
Run Python from Java in about five minutes. You need Java 21+ and Maven (or Gradle); nothing else — no Python install, no native libraries, no containers.
(Embedding from Python instead? See the Python host — the same runtime as a pip-installable wheel.)
1. Add the dependency
Boomslang is published to Maven Central:
<dependency>
  <groupId>com.hubspot</groupId>
  <artifactId>boomslang</artifactId>
  <version>0.1.1</version>
</dependency>
Check Maven Central for the latest version. The default artifact is large (~100 MB) because it bundles the entire Python runtime — CPython, the stdlib, NumPy, Pandas, and friends — plus ahead-of-time compiled classes. If that’s a problem, see Installation & Runtime Variants.
2. Run some Python
import com.hubspot.boomslang.HostBridge;
import com.hubspot.boomslang.PythonExecutorFactory;
import com.hubspot.boomslang.PythonInstance;
import com.hubspot.boomslang.PythonResult;
import java.nio.file.Files;
import java.nio.file.Path;

public class Main {

  public static void main(String[] args) throws Exception {
    Path pythonRoot = Files.createTempDirectory("boomslang-python");

    PythonExecutorFactory factory = PythonExecutorFactory
      .builder()
      .withStdlibPath(pythonRoot)
      .addExtension(HostBridge.builder().buildExtension())
      .build();

    PythonResult result = factory.runOnWasmThread(() -> {
      PythonInstance instance = factory.createInstance(pythonRoot);
      return instance.execute("print('hello from Python')");
    });

    System.out.println(result.stdout()); // hello from Python
  }
}
What each piece does:
- withStdlibPath: a host directory where boomslang extracts the packaged Python resources. The instance root passed to createInstance is what Python sees as /.
- addExtension(HostBridge.builder().buildExtension()): registers the host functions the bundled runtime imports. This line is required: the bundled boomslang.wasm unconditionally imports boomslang.call and boomslang.log, and instantiation fails without an extension that provides them. A bare HostBridge registers a no-op log handler; wire up real handlers when you want Python to call back into Java (see the user guide).
- runOnWasmThread: runs the WASM call on a dedicated thread with a larger JVM stack, and is where you set timeouts.
The first factory build extracts the Python resources to stdlibPath and takes a few seconds; creating instances afterwards is fast (copy-on-write snapshot of a pre-initialized interpreter).
3. Use the batteries
The bundled runtime includes NumPy, Pandas, Matplotlib, Pillow, Pydantic, ijson, and Jinja2:
PythonResult result = factory.runOnWasmThread(() -> {
PythonInstance instance = factory.createInstance(pythonRoot);
return instance.execute("import numpy as np; print(np.arange(5).sum())");
});
System.out.println(result.stdout()); // 10
Next steps
- Reuse one factory for your whole application; it holds the memory snapshot.
- Set timeouts on runOnWasmThread and handle poisoned instances; see the user guide on lifecycle.
- Let Python call your Java code with HostBridge.builder().withFunction(...).
- Trim the dependency with the no-python-runtime classifier if you ship your own runtime resources.
Installation & Runtime Variants
Boomslang is published to Maven Central as com.hubspot:boomslang. Two variants of the artifact exist, distinguished by classifier.
Default artifact
<dependency>
<groupId>com.hubspot</groupId>
<artifactId>boomslang</artifactId>
<version>0.1.1</version>
</dependency>
The default jar includes everything needed to run Python:
- the Java API
- the bundled boomslang.wasm (CPython 3.14 for wasm32-wasip1)
- Python resources: the stdlib plus NumPy, Pandas, Matplotlib, Pillow, Pydantic, ijson, and Jinja2
- generated Chicory AOT classes (com.hubspot.boomslang.compiled.*), so the runtime executes as JVM bytecode
The tradeoff is size: the jar is roughly 100 MB. For most applications that’s a fine price for a zero-setup Python runtime; if it isn’t, use the classifier below.
no-python-runtime classifier
Use this when your application — or another artifact in your dependency tree — provides the Python runtime:
<dependency>
<groupId>com.hubspot</groupId>
<artifactId>boomslang</artifactId>
<version>0.1.1</version>
<classifier>no-python-runtime</classifier>
</dependency>
This classifier excludes python/** and com/hubspot/boomslang/compiled/**; the Java API stays in the artifact. Your application then needs to provide:
- a WASM binary, usually at the classpath location python/bin/boomslang.wasm
- Python resources under python/usr/local/lib/python3.14
- an AOT machine factory if you want AOT instead of interpreter fallback
If your WASM is not at the default classpath location, point the factory at it with withWasmResource(...).
This is the variant to use with a custom Python build — a runtime recompiled with your own typed extensions or extra native libraries.
Python wheel
The Python host package is distributed as a wheel attached to GitHub releases (not PyPI):
pip install https://github.com/HubSpot/boomslang/releases/download/<tag>/boomslang-<version>-py3-none-any.whl
Runtime assets outside Maven
Every release also publishes raw runtime assets to GitHub Releases: the boomslang.wasm binary, a boomslang-runtime-*.tar.gz with the Python resource tree, and sha256 checksums. Per-commit prerelease builds from main are published as build-<sha> releases. These are what non-Java hosts (or no-python-runtime consumers who package resources themselves) consume.
Requirements
- Java 21 or newer.
- No system Python, no native libraries, no containers — the runtime is entirely inside the jar.
Running Python from Java
Create one PythonExecutorFactory and reuse it for the life of your application — it holds the pre-initialized interpreter snapshot. Create a PythonInstance per execution context; instances are cheap (a copy-on-write view of the snapshot).
Path pythonRoot = Files.createTempDirectory("boomslang-python");
PythonExecutorFactory factory = PythonExecutorFactory
.builder()
.withStdlibPath(pythonRoot)
.addExtension(HostBridge.builder().buildExtension())
.build();
PythonResult result = factory.runOnWasmThread(() -> {
PythonInstance instance = factory.createInstance(pythonRoot);
return instance.execute("print('hello from Python')");
});
System.out.println(result.stdout());
- withStdlibPath is a host directory where boomslang extracts the packaged Python resources. The instance root passed to createInstance is what Python sees as /.
- addExtension(HostBridge...) is required with the bundled runtime: its WASM unconditionally imports the boomslang.call / boomslang.log host functions. See Calling host functions.
- runOnWasmThread runs the work on a dedicated WASM thread with a larger JVM stack. Always run Python work through it; see Lifecycle, timeouts & limits for the threading model and timeout semantics.
Results and errors
PythonResult carries stdout(), stderr(), exitCode(), and executionTimeMs(). A Python exception does not throw on the Java side — it produces a result with a non-zero exit code and the traceback in stderr:
PythonResult result = factory.runOnWasmThread(() ->
factory.createInstance(pythonRoot).execute("1 / 0")
);
// result.exitCode() != 0; result.stderr() contains the ZeroDivisionError traceback
Check exitCode() when an execution may fail. Java exceptions are reserved for harder failures: PythonCompilationException from compile(...) on a syntax error, and PythonExecutionException when the WASM runtime itself traps (both include the captured stderr in their message).
Reusing compiled code
Use compile and loadCode when the same source runs many times. Compilation happens once; each run replays the bytecode:
PythonInstance instance = factory.createInstance(pythonRoot);
byte[] bytecode = instance.compile(sourceCode);
PythonResult first = instance.loadCode(bytecode);
instance.reset();
PythonResult second = instance.loadCode(bytecode);
The bytecode is CPython marshal data and is specific to the runtime build that produced it: cache it within a process, but don’t persist it across boomslang version upgrades.
Passing input
Feed data to Python via stdin with setStdin(...) on the instance, or write files into the instance root directory before executing — Python sees that directory as its filesystem.
Lifecycle, Timeouts & Limits
The threading model
All WASM execution must go through factory.runOnWasmThread(...). The factory maintains a dedicated thread pool (currently fixed at 10 threads) whose threads have an enlarged JVM stack — CPython’s C stack lives on the JVM stack under Chicory, and deep Python recursion needs the headroom.
PythonInstance is not thread-safe. Use one instance from one task at a time; create separate instances for concurrent executions (they’re cheap — each is a copy-on-write view of the shared snapshot).
Timeouts
The overload runOnWasmThread(task, timeout, instance) enforces a wall-clock timeout:
PythonInstance instance = factory.createInstance(pythonRoot);
PythonResult result = factory.runOnWasmThread(
() -> instance.execute("print(sum(range(10)))"),
Duration.ofSeconds(5),
instance
);
On timeout the future is cancelled, the instance is poisoned, and a TimeoutException is thrown to the caller.
Honest caveat — timeouts do not hard-stop Python. Cancellation is delivered as a Java thread interrupt, and the interrupt is only observed when the guest calls back into a host function. CPU-bound Python that never calls a host function (a tight pure-computation loop) keeps running on its pool thread past the timeout. Since the pool is fixed-size, enough runaway executions can exhaust it. Hard-stop interruption is tracked in issue #42. Until it lands: treat timeouts as a cooperative mechanism, prefer scripts that do I/O through host functions, and consider process-level isolation if you execute fully untrusted CPU-bound code.
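One way to follow that advice for CPU-heavy guest code is to call a host function at a regular interval so a pending interrupt has somewhere to land. The guest-side sketch below assumes that any host-function call, including boomslang_host.log, acts as such a checkpoint; the text above implies this but does not spell it out. The function name, the checkpoint interval, and the stand-in log used outside the sandbox are illustrative only.

```python
try:
    # Available inside the Boomslang guest.
    from boomslang_host import log
except ImportError:
    # Stand-in so the sketch also runs under a regular CPython.
    def log(level, message):
        pass


def sum_of_squares(n, checkpoint_every=100_000):
    """CPU-bound loop that periodically crosses into the host.

    Each log(...) call enters a host function, which is where a pending
    timeout interrupt can be observed (assumption, see the lead-in).
    """
    total = 0
    for i in range(n):
        total += i * i
        if i % checkpoint_every == 0:
            log(2, f"sum_of_squares progress: {i}/{n}")
    return total


print(sum_of_squares(10))  # 285
```

A smaller interval shortens the worst-case overrun after a timeout at the cost of more host calls; the right value depends on how expensive your log handler is.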
Poisoned instances
A poisoned instance refuses further work. Because the timed-out execution may still be running (see above), the safest response is to discard the instance and create a new one. reset() restores the instance memory to the golden snapshot and clears the poison flag, but resetting while the abandoned execution is still on the WASM thread races with it — only reset() when you know the prior call actually finished.
if (instance.isPoisoned()) {
instance = factory.createInstance(pythonRoot); // preferred over reset()
}
reset() is also useful in the happy path: it returns a healthy instance to the pristine snapshot state between executions (fresh __main__, no leaked globals) much faster than re-importing anything.
Resource limits
createInstance(rootPath, limits) accepts a ResourceLimits:
ResourceLimits limits = ResourceLimits
.builder()
.maximumOutputBytes(1024 * 1024) // cap captured stdout/stderr (default 10 MB)
.maximumMemoryPages(4096) // cap guest memory growth (64 KiB pages)
.build();
PythonInstance instance = factory.createInstance(pythonRoot, limits);
Caveat: ResourceLimits.executionTimeout exists on the record but is not currently enforced; the only enforced timeout is the one you pass to runOnWasmThread (also tracked in issue #42).
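Because maximumMemoryPages is counted in 64 KiB WebAssembly pages, it helps to convert to and from MiB when choosing a cap. The helper names below are illustrative; only the page size comes from the text above.

```python
WASM_PAGE_BYTES = 64 * 1024  # one WebAssembly page is 64 KiB
MIB = 1024 * 1024


def pages_to_mib(pages):
    """Memory covered by a page count, in MiB."""
    return pages * WASM_PAGE_BYTES / MIB


def mib_to_pages(mib):
    """Smallest page count that covers the requested number of MiB."""
    total_bytes = mib * MIB
    return -(-total_bytes // WASM_PAGE_BYTES)  # ceiling division


print(pages_to_mib(4096))  # 256.0
print(mib_to_pages(512))   # 8192
```

So the maximumMemoryPages(4096) example caps guest memory growth at 256 MiB.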
Cleanup
- The factory pins the golden memory snapshot (hundreds of MB with the bundled runtime) for its lifetime — another reason to create exactly one.
- PythonInstance.close() marks the instance unusable; instance memory is reclaimed by GC once unreferenced.
Calling Host Functions from Python
The bundled runtime exposes two host functions to Python through the boomslang_host module: call(name, args) and log(level, message). The Java side decides what they do.
PythonExecutorFactory factory = PythonExecutorFactory
.builder()
.withStdlibPath(pythonRoot)
.addExtension(
HostBridge
.builder()
.withFunction("lookup_user", userId -> userService.findById(userId).toJson())
.withLogHandler((level, message) -> LOG.info("[Python] {}", message))
.buildExtension()
)
.build();
from boomslang_host import call, log
user_json = call("lookup_user", "12345")
log(2, "loaded user")
Handler options
HostBridge.Builder gives you three levels of control:
- withFunction(name, fn): register named String -> String handlers; unknown names raise in Python.
- withCallHandler((name, args) -> ...): one handler that receives every call(name, args); use it for dynamic dispatch. Do not combine it with withFunction registrations; if both are present, the withCallHandler takes precedence.
- withLogHandler((level, message) -> ...): receives log(...) calls; the default is a no-op.
Values cross the boundary as strings. The common pattern is JSON in, JSON out — serialize on whichever side is more convenient.
If you register no handlers at all (HostBridge.builder().buildExtension(), as in the quickstart), logging is a no-op and any call(...) raises a RuntimeError in Python. The extension still must be registered: the bundled WASM unconditionally imports boomslang.call and boomslang.log, and the factory fails to instantiate it without them.
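On the guest side, the JSON-in, JSON-out pattern fits in a small wrapper. This is a sketch, not part of boomslang_host: call_json and call_json_or_default are hypothetical helper names, and the fallback call defined under ImportError is a stand-in so the code runs outside the sandbox. It also assumes an unregistered name surfaces as RuntimeError, as described above for a bridge with no handlers.

```python
import json

try:
    # Available inside the Boomslang guest.
    from boomslang_host import call
except ImportError:
    # Stand-in host for running the sketch under a regular CPython.
    def call(name, args):
        if name != "lookup_user":
            raise RuntimeError(f"unknown host function: {name}")
        request = json.loads(args)
        return json.dumps({"id": request["id"], "name": "Ada"})


def call_json(name, payload):
    """Serialize payload, call the host, and decode the JSON reply."""
    return json.loads(call(name, json.dumps(payload)))


def call_json_or_default(name, payload, default=None):
    """Like call_json, but return default if the host call raises."""
    try:
        return call_json(name, payload)
    except RuntimeError:
        return default


user = call_json("lookup_user", {"id": 7})
print(user["name"])
```

Keeping serialization in one place means the rest of the guest script works with dicts and lists rather than raw strings.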
Errors and interruption
- An exception thrown by a Java handler surfaces as a Python exception at the call(...) site.
- Host-function entry is also where thread interruption (from timeouts) is observed, so handlers should not swallow InterruptedException.
Beyond strings: typed extensions
boomslang_host.call is deliberately blunt: one stringly-typed entry point. When you want dedicated Python functions with typed signatures (def lookup(request: str, shard: int) -> str) and no JSON overhead, define a custom extension with the boomslang-hostgen DSL and build a custom Python runtime. For Java CompletionStage work awaited from Python asyncio, see Async host calls.
Async Host Calls
Python code can await asynchronous Java work. The Java side returns a CompletionStage<String>; the Python side awaits it with standard asyncio APIs; the AsyncHostRegistry brokers completions between the two.
This works with the bundled runtime’s string bridge, and — more ergonomically — with typed async functions in a custom extension.
Java setup
Share one AsyncHostRegistry between the HostBridge and any async extensions:
AsyncHostRegistry asyncRegistry = new AsyncHostRegistry();
var hostBridge = HostBridge
.builder()
.withAsyncRegistry(asyncRegistry)
.withAsyncFunction("lookup", payload -> rpcClient.lookupAsync(payload))
.buildExtension();
PythonExecutorFactory factory = PythonExecutorFactory
.builder()
.withStdlibPath(pythonRoot)
.addExtension(hostBridge)
.build();
The handler returns a CompletionStage<String>. Java completion threads only enqueue results into the registry — Python is resumed by the Boomslang event loop polling for completions on the WASM thread, so no Java thread ever touches guest memory concurrently.
Python side
Install the Boomslang event loop, then use normal asyncio:
import asyncio
from boomslang_host.asyncio import install, async_call

install()

async def main():
    first = async_call("lookup", '{"id": 1}')
    second = async_call("lookup", '{"id": 2}')
    results = await asyncio.gather(first, second)
    print(results)

asyncio.run(main())
Concurrency comes from overlapping the Java-side work: both lookups run in Java simultaneously while Python awaits.
Typed async functions in custom extensions
A custom Python build can declare async functions with typed parameters in the hostgen DSL (f.r#async().param(...).returns(Type::String)). Python then imports them as real module functions and awaits them directly:
import asyncio
from boomslang_host.asyncio import install
from my_async_ext import lookup

install()

async def main():
    results = await asyncio.gather(lookup('{"id": 1}', 0), lookup('{"id": 2}', 1))
    print(results)

asyncio.run(main())
On the Java side the generated builder takes typed handlers returning CompletionStage<String>, plus the shared registry via withAsyncRegistry(asyncRegistry).
Async functions currently support typed parameters with a string return.
Failure semantics
- A handler that throws synchronously, or a stage that completes exceptionally, surfaces as a HostAsyncError raised from the awaiting coroutine; failures never hang the event loop.
- Cancelling the Python task cancels the in-flight Java future.
Under the hood
The two sides speak a small, versioned wire protocol (__async_protocol__, __async_start__, __async_poll__, __async_result__, __async_cancel__) over the stock call bridge. The __async_* names are a reserved control namespace — don’t define extension functions with those names. The full protocol is specified in the reference section (async wire protocol).
Adding Python Modules & the Overlay
Three mechanisms get extra Python code into the runtime, in increasing order of weight.
In-memory modules
Install small pure-Python packages when you create the factory:
PythonExecutorFactory factory = PythonExecutorFactory
.builder()
.withStdlibPath(pythonRoot)
.withModule("my_package", "helpers", "def double(x): return x * 2")
.build();
from my_package.helpers import double
withLibrary(name, modules) does the same for a map of module name → source. Use this for small helper code generated or selected at runtime.
The resource overlay (repo contributors)
Inside this repo, most packaged runtime files under core/src/main/resources/python/ are generated by the WASM/CPython pipeline and git-ignored. Small source-controlled Python additions to the stock runtime live under core/src/main/resources/python-overlay/, which mirrors the guest filesystem layout:
core/src/main/resources/python-overlay/usr/local/lib/python3.14/boomslang_host/asyncio.py
→ <stdlibPath>/usr/local/lib/python3.14/boomslang_host/asyncio.py
During factory creation, boomslang extracts the generated python/ resources first, then copies the overlay on top.
Snapshot precedence: modules that are prewarmed into the Wizer snapshot are served from the frozen sys.modules at runtime. For those, editing the overlay file has no effect until the WASM is rebuilt. Use the overlay only for modules that are imported at execution time.
The resource pipeline
Larger third-party packages — anything with native code, or big enough that in-memory installation is impractical — belong in the WASM/Python build pipeline, where they are baked into the runtime resources (and optionally prewarmed). See Custom Python builds.
Supported Python Libraries
The bundled runtime ships the CPython 3.14 standard library plus these third-party packages, including their native extensions compiled to WASI and statically linked:
import ijson
import jinja2
import matplotlib
import numpy as np
import pandas as pd
from PIL import Image
from pydantic import BaseModel
Notes:
- Matplotlib renders to in-memory buffers / files (e.g. savefig to the instance filesystem); there is no display backend.
- Pillow supports reading and writing common formats (PNG round-trips are covered by integration tests).
- Packages with native code cannot be pip installed into the runtime: WASI has no dynamic linking, so native extensions must be statically linked at build time. To add one, extend the build pipeline.
- Pure-Python packages can be added without rebuilding anything via in-memory modules or the resource pipeline.
Custom Python Builds
Build a custom Python/WASM runtime when the stock boomslang_host.call(...) bridge is too blunt. A custom build changes the Rust/WASI guest inside boomslang.wasm; it is independent of whether the outside host is Java, Rust, or another language.
Use one for:
- typed WASM imports instead of string/JSON calls
- host functions exposed as custom Python modules
- extra Python modules prewarmed into the Wizer snapshot
- native libraries required by Python extensions (WASI has no dynamic linking — native code must be statically linked into the guest)
The runnable starting point is examples/custom-python-build/.
The build flow
- Declare an extension contract in your extension crate’s build.rs with the boomslang-hostgen DSL.
- boomslang-hostgen emits the Rust guest code and an ABI JSON file at build time.
- Generate host-language bridge code from the ABI JSON (Java or Rust adapters).
- Compose the extension with boomslang-host-core in a custom guest crate.
- Add any required native static libraries to the WASI build.
- Build the guest to wasm32-wasip1.
- Package the custom boomslang.wasm and matching Python resources with your app.
- For Java packaging, depend on com.hubspot:boomslang with the no-python-runtime classifier.
Declaring an extension
// my-ext/build.rs
fn main() {
    let ext = boomslang_hostgen::ExtensionSpec::new("myext")
        .wasm_module("myext")
        .prewarm(["_myext"])
        .function("do_thing", |f| {
            f.param("input", boomslang_hostgen::Type::String)
                .returns(boomslang_hostgen::Type::String)
        });

    boomslang_hostgen::Build::new(ext)
        .emit()
        .generate()
        .expect("generate myext");

    println!("cargo:rerun-if-changed=build.rs");
}

// my-ext/src/lib.rs
include!(concat!(env!("OUT_DIR"), "/ext_myext.rs"));
Register it alongside the stock host bridge in your guest crate:
boomslang_host_core::init(
    || {
        boomslang_ext_host_bridge::register();
        my_extension::register();
    },
    |py| {
        boomslang_ext_host_bridge::prewarm(py);
        my_extension::prewarm(py);
    },
);
And build:
export CPYTHON_WASI_DIR=../../cpython/build/cpython-wasi # or omit to download from GitHub Releases
cargo build --target wasm32-wasip1 --release
Generating host adapters
Run the hostgen CLI against the ABI JSON your build emitted:
# Java hosts
boomslang-hostgen myext.abi.json --java-out src/main/java --java-package com.example.generated
# Rust/Wasmtime hosts
boomslang-hostgen myext.abi.json --rust-host-out src/generated
The generated Java class exposes typed functional interfaces and a builder — you only fill in implementations:
var extension = MyextHostFunctions.builder()
.withDoThing(input -> "result from Java")
.buildExtension();
PythonExecutorFactory factory = PythonExecutorFactory
.builder()
.withStdlibPath(pythonRoot)
.addExtension(extension)
.build();
Python code then imports the extension as a real module:
from myext import do_thing
print(do_thing("hello"))
Async functions (f.r#async() in the DSL) generate CompletionStage<String> handlers on the Java side; see Async host calls.
The full DSL surface and the ABI JSON contract are documented in the reference section.
Python Host (boomslang-py)
Run sandboxed Python from Python. The boomslang-py/ package bundles the same WASM runtime as the Java artifact (CPython 3.14 with NumPy, Pandas, Matplotlib, Pillow, Pydantic, ijson, and Jinja2) and executes it with wasmtime. Guest code has no network access and can only touch the directories you mount.
Wheels are published as GitHub release assets (not PyPI):
pip install https://github.com/HubSpot/boomslang/releases/download/<tag>/boomslang-<version>-py3-none-any.whl
From a source checkout: just fetch-main-wasm && just python-stage, then pip install -e boomslang-py.
Quickstart
from boomslang import Sandbox

with Sandbox() as sandbox:
    result = sandbox.execute("print('hello from the sandbox')")
    print(result.stdout)  # hello from the sandbox
    print(result.exit_code)  # 0
The semantics mirror the Java host: interpreter state persists across execute() calls on the same sandbox; reset() restores the pristine interpreter image; guest Python errors don’t raise on the host — they surface as exit_code != 0 with the traceback in result.stderr.
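Since guest failures arrive as data rather than exceptions, callers that prefer exceptions can convert at the boundary. The sketch below is an assumption-laden illustration: GuestError and stdout_or_raise are hypothetical names, and FakeResult is a stand-in carrying only the three fields named above so the code runs without the wheel installed.

```python
from dataclasses import dataclass


@dataclass
class FakeResult:
    """Stand-in with the result fields described above."""
    stdout: str
    stderr: str
    exit_code: int


class GuestError(Exception):
    """Raised on the host when guest code exits non-zero (hypothetical)."""


def stdout_or_raise(result):
    """Return stdout on success; raise GuestError with the traceback's last line."""
    if result.exit_code != 0:
        lines = result.stderr.strip().splitlines()
        summary = lines[-1] if lines else "<no stderr captured>"
        raise GuestError(f"guest exited with {result.exit_code}: {summary}")
    return result.stdout


ok = FakeResult(stdout="hello\n", stderr="", exit_code=0)
print(stdout_or_raise(ok), end="")
```

The last traceback line is usually the exception type and message; keep the full stderr around for logging if you need the stack.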
Resource limits and timeouts
from boomslang import ResourceLimits, Sandbox
sandbox = Sandbox(limits=ResourceLimits(
timeout=10.0, # seconds, default 120
max_memory_bytes=512 * 1024 * 1024, # default: wasm32 4 GiB cap
max_output_bytes=1024 * 1024, # per stream, default 10 MiB
))
A timeout raises PythonTimeoutError and poisons the sandbox; call reset() to revive it. max_memory_bytes must exceed the baseline runtime image (~150 MB) or instantiation fails. Note that unlike the Java host, wasmtime gives the Python host real interruption — timed-out guest code is actually stopped.
Filesystem
The guest sees a fixed layout: /usr (bundled runtime, read-only), /lib (lib_dir=, on the guest sys.path), /work (work_dir=), and /tmp (managed per-sandbox). work_dir and lib_dir default to managed temp dirs exposed as sandbox.work_dir / sandbox.lib_dir. Arbitrary extra mounts are not supported — share files through /work, and drop extra pure-Python libraries into lib_dir to make them importable.
(The mount table is baked into the runtime image by Wizer; the sandbox probes the image’s layout once per process and either mounts your directories directly or transparently syncs files around each execution — visible-before/appears-after semantics are the same either way.)
Host functions
Guest code calls back into your process through the same boomslang_host bridge the Java host uses; arguments and results cross as JSON. Results larger than the bridge’s 1 MiB buffer are fetched back in chunks transparently.
from boomslang import Sandbox

sandbox = Sandbox()

@sandbox.host_function("lookup_user")
def lookup_user(args):
    return {"id": args["id"], "name": "Ada"}

result = sandbox.execute("""
import json
from boomslang_host import call
user = json.loads(call("lookup_user", json.dumps({"id": 7})))
print(user["name"])
""")
call_handler= gives raw (name, args_json) -> json control; on_log= receives boomslang_host.log() output (default: the boomslang.guest logger).
Async: @sandbox.async_host_function("fetch") handlers run on a host thread pool, and guest coroutines await them via boomslang_host.asyncio — the same wire protocol as the Java AsyncHostRegistry, so the guide’s async patterns apply unchanged.
Bytecode, functions, stdin
sandbox.compile() / load_bytecode() / execute_function(name, json_args) mirror the Java compile/loadCode/executeFunction flow, and sandbox.set_stdin(...) feeds the next execution’s input() (consumed then cleared, like the Java host).
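Compiled bytecode is tied to the runtime build that produced it (see Reusing compiled code), so a host that reuses compile() output may want an in-process cache keyed on both the source and the runtime version. The sketch below is one possible shape: BytecodeCache is a hypothetical helper, compile_fn stands for something like sandbox.compile (assumed here to take a source string and return bytes), and runtime_version is whatever identifier you use for the installed wheel.

```python
import hashlib


class BytecodeCache:
    """In-process cache of compiled bytecode, keyed by runtime version and source."""

    def __init__(self, compile_fn, runtime_version):
        self._compile = compile_fn            # e.g. sandbox.compile (assumed signature)
        self._runtime_version = runtime_version
        self._entries = {}

    def get(self, source):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        key = (self._runtime_version, digest)
        if key not in self._entries:
            # Compile only on the first request for this (version, source) pair.
            self._entries[key] = self._compile(source)
        return self._entries[key]


# Demo with a fake compiler that records how often it is invoked.
compiled_sources = []

def fake_compile(source):
    compiled_sources.append(source)
    return source.encode("utf-8")

cache = BytecodeCache(fake_compile, runtime_version="example-build")
cache.get("x = 1")
cache.get("x = 1")
print(len(compiled_sources))  # 1
```

Including the version in the key means an upgraded runtime never sees bytecode from an older build, even if the cache object outlives a reload.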
Performance notes
- The first Sandbox() on a machine compiles the ~100 MB module (up to minutes); wasmtime caches the compiled module on disk, so later processes start in under a second.
- Each sandbox materializes its own copy of the runtime memory (hundreds of MB); there is no copy-on-write sharing like the Java host’s CopyOnWriteMemory. Reuse sandboxes with reset() where isolation allows.
- pip install --no-compile skips byte-compiling the bundled stdlib tree, which the guest never reads.
The package README (boomslang-py/README.md) ships with the wheel and is the canonical package-level reference.
Embedding from Rust (and Other Hosts)
The extension ABI is not tied to Java. An extension declares its contract in build.rs with the boomslang-hostgen DSL and emits ABI JSON; host-language adapters are generated from that JSON. Any WASM runtime that implements the same imports can run boomslang.wasm.
| Host language | Status | Runtime | Host adapter support |
|---|---|---|---|
| Java | Primary host | Chicory | Stock runtime API, HostBridge, generated Java adapters |
| Python | Supported host package | Wasmtime (wasmtime-py) | boomslang-py wheel with the Sandbox API |
| Rust | Supported example host | Wasmtime | Generated Rust adapters; see below |
| Other languages | ABI target | Any WASM runtime with compatible imports | Implement the ABI JSON contract directly |
The Rust example host
examples/rust-host/ is a runnable Wasmtime embedder. Its build.rs turns an <extension>.abi.json into typed Wasmtime bindings:
cargo run --manifest-path examples/rust-host/Cargo.toml
The generated binding is a typed builder plus a register(&mut wasmtime::Linker<_>):
let host = BoomslangHostHostFunctions::builder()
    .with_call(|name, payload| Ok(format!("{name}: {payload}")))
    .with_log(|level, message| {
        eprintln!("[guest log:{level}] {message}");
        Ok(())
    })
    .build();
host.register(&mut linker)?;
Generated Rust host bindings also include an AsyncHostRegistry mirroring the Java one: typed async imports return registry tokens, and the stock call handler routes the reserved __async_* control calls through the same registry.
Embedding a full boomslang.wasm
The generated register covers the extension imports. A complete embedder additionally:
- adds WASI preview1 imports to the same Linker
- instantiates the module
- drives execution through boomslang’s exported functions (alloc, compile_source, execute, and the output-buffer protocol)
The exported function contract is specified in the reference section (base ABI). Runtime assets (boomslang.wasm + the Python resource tree) are published on GitHub Releases — see Installation.
Base ABI Specification
This page specifies the contract between a host and boomslang.wasm: the functions the guest exports and the conventions for calling them. It is the contract PythonInstance implements on the Java side, and what a non-Java embedder must implement directly. (Host functions the guest imports are covered by the extension ABI.)
Source of truth: python-host-core/src/export.rs (guest) and core/src/main/java/com/hubspot/boomslang/PythonInstance.java (Java host).
There is currently no ABI version export; compatibility between a host and a wasm artifact is by construction (build them from the same commit). A version handshake is tracked in issue #43.
Conventions
- The guest exports a single linear memory. All pointers are i32 offsets into it.
- The host owns buffer lifecycles. Allocate guest buffers with alloc, write through the exported memory, pass (ptr, len) pairs, and free with dealloc after the call. The guest never frees host-allocated buffers, and the guest’s internal allocations are not the host’s concern.
- All strings are UTF-8. Passing invalid UTF-8 where a string is expected returns -1.
- Every execution-family export (compile_source, load_bytecode, execute, execute_function, install_module, uninstall_module) clears the captured stdout/stderr buffers on entry. Read outputs after each call, before the next one.
- Error reporting is two-channel: a coarse return code, plus the Python traceback captured in the stderr buffer. Detailed error strings only exist in stderr.
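The buffer-ownership rule (allocate, write, pass, always free) maps naturally onto a scoped helper. The sketch below models it with an in-memory FakeGuest so it runs anywhere; FakeGuest, guest_buffer, and the bump allocator are illustrative stand-ins, not the real module or any WASM runtime's API. A real host would replace them with calls to the alloc and dealloc exports and writes into the exported memory.

```python
from contextlib import contextmanager


class FakeGuest:
    """In-memory stand-in for an instantiated module (alloc, dealloc, memory)."""

    def __init__(self, size=1024):
        self.memory = bytearray(size)
        self._next = 8      # simple bump allocator; offset 0 is never handed out
        self.live = {}      # ptr -> size of allocations not yet freed

    def alloc(self, size):
        ptr = self._next
        self._next += size
        self.live[ptr] = size
        return ptr

    def dealloc(self, ptr, size):
        del self.live[ptr]


@contextmanager
def guest_buffer(guest, data):
    """Copy data into a guest buffer and free it when the block exits."""
    ptr = guest.alloc(len(data))
    try:
        guest.memory[ptr:ptr + len(data)] = data
        yield ptr, len(data)
    finally:
        # The host owns the buffer, so it frees it even if the call failed.
        guest.dealloc(ptr, len(data))


guest = FakeGuest()
script = "print('hi')".encode("utf-8")
with guest_buffer(guest, script) as (ptr, length):
    # A real host would pass (ptr, length) to an export such as execute here.
    copied = bytes(guest.memory[ptr:ptr + length])
print(copied == script, guest.live)
```

Putting dealloc in a finally block keeps the guest heap from leaking when an export traps or the host raises between allocation and the call.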
Exports
| Export | Signature | Semantics |
|---|---|---|
| alloc | (size: i32) -> i32 | Allocate size bytes in guest memory (mimalloc); returns pointer. |
| dealloc | (ptr: i32, size: i32) | Free an alloc’d buffer. size is currently ignored but pass the allocated size. |
| compile_source | (source_ptr: i32, source_len: i32, output_ptr: i32, output_max_len: i32) -> i32 | Compile Python source to marshal bytecode, written to the caller-provided output buffer. Returns the bytecode length, -1 on invalid UTF-8 or compile error (traceback in stderr), -3 if the bytecode exceeds output_max_len. |
| load_bytecode | (ptr: i32, len: i32) -> i32 | Unmarshal and execute bytecode from compile_source. 0 ok; 1 Python error (traceback in stderr). |
| execute | (script_ptr: i32, script_len: i32) -> i32 | Execute Python source in __main__. 0 ok; 1 Python error; -1 invalid UTF-8. |
| execute_function | (name_ptr: i32, name_len: i32, args_ptr: i32, args_len: i32) -> i32 | Call a named function from previously loaded code with one string argument (args_len 0 → empty string). 0 / 1 / -1 as above. |
| get_stdout_len / get_stderr_len | () -> i32 | Byte length of the captured stream. |
| get_stdout / get_stderr | (ptr: i32, max_len: i32) -> i32 | Copy up to max_len bytes of the captured stream into the caller’s buffer; returns bytes written. |
| install_module | (name_ptr: i32, name_len: i32, source_ptr: i32, source_len: i32) -> i32 | Install a pure-Python module under name (dotted names allowed). 0 / 1 / -1. |
| uninstall_module | (name_ptr: i32, name_len: i32) -> i32 | Remove an installed module. 0 / 1 / -1. |
| reset_state | () | Clear capture buffers and reset the __main__ namespace. Note: the Java host does not call this; it resets by restoring the copy-on-write memory snapshot, which is stricter. |
| get_heap_pages | () -> i32 | Current guest memory size in 64 KiB pages. Used by hosts to size snapshots. |
Imports
A complete embedder must provide, on the same linker/instance:
- WASI preview1: filesystem, clock, random, stdio.
- Extension imports: the bundled runtime imports boomslang.call and boomslang.log (extension ABI); custom builds import whatever their extensions declare.
Instantiation fails on any missing import.
Call sequences
Execute a script and read output (what PythonInstance.execute does):
ptr = alloc(len(script)) # write script bytes at ptr
rc = execute(ptr, len(script)) # 0 ok, 1 python error, -1 bad utf-8
dealloc(ptr, len(script))
n = get_stdout_len()
buf = alloc(n); get_stdout(buf, n) # read n bytes from memory at buf
dealloc(buf, n) # same dance for stderr
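The sequence above can be tried out without a WASM runtime. The following is a minimal sketch, assuming a toy stand-in for the guest: FakeGuest is hypothetical, mirrors only the export names from the table, and runs scripts with the host interpreter's exec instead of a sandboxed CPython. Only run_script reflects the call order described here.

```python
import contextlib
import io


class FakeGuest:
    """Hypothetical stand-in for boomslang.wasm: same export names, toy behavior."""

    def __init__(self):
        self.memory = bytearray(64 * 1024)  # one 64 KiB page of "linear memory"
        self._next = 8                      # bump allocator; offset 0 is never handed out
        self._stdout = b""

    def alloc(self, size):
        ptr = self._next
        self._next += max(size, 1)
        return ptr

    def dealloc(self, ptr, size):
        pass  # the toy allocator never reuses memory

    def execute(self, ptr, length):
        self._stdout = b""  # execution-family exports clear captured output on entry
        try:
            script = bytes(self.memory[ptr:ptr + length]).decode("utf-8")
        except UnicodeDecodeError:
            return -1  # invalid UTF-8
        captured = io.StringIO()
        try:
            with contextlib.redirect_stdout(captured):
                exec(script, {"__name__": "__main__"})
        except Exception:
            return 1  # Python error
        finally:
            self._stdout = captured.getvalue().encode("utf-8")
        return 0

    def get_stdout_len(self):
        return len(self._stdout)

    def get_stdout(self, ptr, max_len):
        data = self._stdout[:max_len]
        self.memory[ptr:ptr + len(data)] = data
        return len(data)


def run_script(guest, script):
    """Follow the alloc / write / execute / dealloc / read-output sequence."""
    src = script.encode("utf-8")
    ptr = guest.alloc(len(src))
    guest.memory[ptr:ptr + len(src)] = src      # write script bytes at ptr
    rc = guest.execute(ptr, len(src))
    guest.dealloc(ptr, len(src))

    n = guest.get_stdout_len()
    buf = guest.alloc(n)
    written = guest.get_stdout(buf, n)
    out = bytes(guest.memory[buf:buf + written])  # read bytes back from memory at buf
    guest.dealloc(buf, n)
    return rc, out.decode("utf-8")
```

Calling run_script(FakeGuest(), "print('hi')") walks the same steps a real embedder would perform against the exported memory; a real host would repeat the read step for stderr.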
Compile once, run many (compile / loadCode):
out = alloc(MAX) # Java uses MAX = 10 MiB
n = compile_source(src, len, out, MAX) # n = bytecode length, or -1 / -3
bytecode = memory[out .. out+n]; dealloc(out, MAX)
...
ptr = alloc(len(bytecode)) # later, possibly many times; write bytecode at ptr
rc = load_bytecode(ptr, len(bytecode)) # 0 / 1
dealloc(ptr, len(bytecode))
The bytecode is CPython marshal data — valid only for the exact runtime build that produced it.
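One way a host might respect that constraint when caching compiled bytecode is to key the cache on the runtime artifact as well as the source. The sketch below is an illustration under that assumption, not something the Java host is documented to do; BytecodeCache, bytecode_cache_key, and compile_fn are hypothetical names.

```python
import hashlib


def bytecode_cache_key(runtime_wasm: bytes, source: str) -> str:
    """Combine a digest of the runtime build with a digest of the source."""
    runtime_digest = hashlib.sha256(runtime_wasm).hexdigest()
    source_digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
    return f"{runtime_digest}:{source_digest}"


class BytecodeCache:
    """In-memory cache that never reuses bytecode across runtime builds."""

    def __init__(self, runtime_wasm: bytes):
        self._runtime_wasm = runtime_wasm
        self._entries = {}

    def get_or_compile(self, source: str, compile_fn):
        # compile_fn stands in for the compile_source call sequence.
        key = bytecode_cache_key(self._runtime_wasm, source)
        if key not in self._entries:
            self._entries[key] = compile_fn(source)
        return self._entries[key]
```

Because the runtime digest is part of the key, swapping in a different boomslang.wasm makes every old entry unreachable instead of feeding stale marshal data to load_bytecode.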
Known sharp edges
- -1 is overloaded: it means both “invalid UTF-8 input” and “Python-level failure” for compile_source. Disambiguate via stderr.
- There is no structured error channel; hosts surface failures by pairing the return code with the captured stderr.
- Output larger than the host’s configured cap (Java default 10 MB) is rejected host-side, not guest-side.
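A host-side helper for the overloaded -1 might look like the sketch below. It assumes that an invalid UTF-8 input leaves the stderr buffer empty while a compile error leaves a traceback there; that distinction is an assumption of this example, and CompileOutcome and classify_compile_source are hypothetical names.

```python
from enum import Enum


class CompileOutcome(Enum):
    OK = "ok"
    INVALID_UTF8 = "invalid_utf8"
    COMPILE_ERROR = "compile_error"
    OUTPUT_TOO_SMALL = "output_too_small"


def classify_compile_source(rc: int, stderr: str) -> CompileOutcome:
    """Pair the coarse return code with captured stderr to name the failure."""
    if rc >= 0:
        return CompileOutcome.OK              # rc is the bytecode length
    if rc == -3:
        return CompileOutcome.OUTPUT_TOO_SMALL
    if rc == -1:
        # Assumption: only a Python-level failure writes a traceback to stderr.
        if stderr.strip():
            return CompileOutcome.COMPILE_ERROR
        return CompileOutcome.INVALID_UTF8
    raise ValueError(f"unexpected compile_source return code: {rc}")
```

Read stderr immediately after the failing call, since the next execution-family export clears it.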
Extension ABI JSON & Lowering
An extension declares its host functions once, in build.rs, with the hostgen DSL. The build emits an ABI JSON file — the language-neutral contract from which host adapters (Java, Rust, or hand-written for any runtime) are generated.
Schema
{
  "abi_version": 1,
  "extension": {
    "name": "boomslang_host",
    "wasm_module": "boomslang",
    "prewarm": ["_boomslang_host", "boomslang_host", "boomslang_host.asyncio"]
  },
  "functions": [
    {
      "name": "call",
      "params": [
        { "name": "name", "type": "string" },
        { "name": "args", "type": "string" }
      ],
      "returns": "string",
      "async": false
    },
    {
      "name": "log",
      "params": [
        { "name": "level", "type": "int" },
        { "name": "message", "type": "string" }
      ],
      "returns": null,
      "async": false
    }
  ]
}
| Field | Meaning |
|---|---|
| abi_version | Schema version. Generators require an exact match (currently 1) and fail with a clear error otherwise. If omitted, defaults to 1. |
| extension.name | Extension identifier. Drives generated names: Python module <name>, guest file ext_<name>.rs, Java class <Name>HostFunctions, Rust host file host_<name>.rs. |
| extension.wasm_module | The WASM import module the functions live under (e.g. import boomslang.call). Defaults to the extension name when omitted. |
| extension.prewarm | Python modules imported during Wizer initialization, frozen into the golden snapshot. |
| functions[].name | Function name; becomes the import name and the Python-visible function. |
| functions[].params | Ordered typed parameters. |
| functions[].returns | Return type or null for none. Async functions must return string. |
| functions[].async | Whether the function is an async host call (see below). |
Types are a closed enum: string, int, float, bytes. Unknown type values fail parsing.
Lowering to WASM signatures
The ABI JSON decides the import signatures and memory protocol. For a function with declared params and return:
| Declared | Lowered |
|---|---|
| string / bytes param | i32 ptr, i32 len (UTF-8 bytes for strings) |
| int param | i32 |
| float param | f64 |
| string / bytes return | caller appends i32 result_ptr, i32 result_max_len; host writes the value into that buffer and returns the written byte length as i32 |
| no return | i32 status return |
| async function | returns an i64 host token instead of a value (see the async wire protocol) |
So declared call(name: string, args: string) -> string becomes the import:
boomslang.call(name_ptr: i32, name_len: i32,
args_ptr: i32, args_len: i32,
result_ptr: i32, result_max_len: i32) -> i32
Result buffer protocol: the guest allocates the result buffer (currently capped at 1 MiB per call) and passes it to the host. A negative return signals failure: -1 for a handler error, -2 when the value did not fit in result_max_len. The guest surfaces any negative return as a Python exception.
Behavioral note: on malformed pointers the generated Java host traps the instance, while the generated Rust host returns -1; aligning these is tracked in issue #44.
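The lowering table can be expressed as a small function over one entry of the ABI JSON functions array. This is a sketch of the rules as stated on this page, not the hostgen implementation: lower_signature is a hypothetical name, the table does not say how int or float returns are lowered (so the sketch refuses them), and it assumes async functions take no result buffer because they return a token instead of a value.

```python
def lower_signature(module: str, func: dict) -> str:
    """Render the WASM import signature for one ABI JSON function entry."""
    params = []
    for param in func["params"]:
        name, kind = param["name"], param["type"]
        if kind in ("string", "bytes"):
            params += [(f"{name}_ptr", "i32"), (f"{name}_len", "i32")]
        elif kind == "int":
            params.append((name, "i32"))
        elif kind == "float":
            params.append((name, "f64"))
        else:
            raise ValueError(f"unknown type: {kind}")

    returns = func.get("returns")
    if func.get("async", False):
        result = "i64"                      # host token instead of a value
    elif returns is None:
        result = "i32"                      # status return
    elif returns in ("string", "bytes"):
        params += [("result_ptr", "i32"), ("result_max_len", "i32")]
        result = "i32"                      # written byte length, or negative on failure
    else:
        raise NotImplementedError(f"lowering of {returns} returns is not covered here")

    rendered = ", ".join(f"{name}: {kind}" for name, kind in params)
    return f"{module}.{func['name']}({rendered}) -> {result}"
```

Feeding it the call entry from the schema example reproduces the boomslang.call import shown above.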
Generated artifacts
From one declaration, hostgen produces:
- Rust guest (ext_<name>.rs, included via include! into your extension crate): the extern imports, a Python module exposing typed functions, and register()/prewarm() hooks for boomslang_host_core::init.
- Java host adapter (<Name>HostFunctions.java): typed functional interfaces + a builder producing a BoomslangExtension for PythonExecutorFactory.addExtension.
- Rust host adapter (host_<name>.rs): a typed builder with register(&mut wasmtime::Linker<_>).
Function names prefixed __async_ are reserved for the async control namespace and rejected by validation.
Async Wire Protocol (v1)
boomslang_host.asyncio (the Python client) and the host-side AsyncHostRegistry talk over a small, versioned protocol invoked through the stock boomslang_host.call(name, args) bridge. This page is the wire-level specification; usage is in the async guide.
The __async_* names are a reserved control namespace — extension host functions may not use them (hostgen validation rejects them).
| Control call | Args | Returns |
|---|---|---|
| __async_protocol__ | (none) | integer protocol version (currently 1) |
| __async_start__ | name\npayload | decimal token for a registered named async handler |
| __async_poll__ | timeout ms (<0 blocks, 0 polls) | one header line per ready completion: token\t{1\|0}\t<valueByteLength> |
| __async_result__ | token | base64 of that completion’s value bytes (consumes it) |
| __async_cancel__ | token | cancels the in-flight future |
Typed async extension functions bypass __async_start__: their WASM import returns the i64 token directly from the shared registry. Polling, result retrieval, and cancellation still flow through the control calls above.
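The string formats in the table are simple enough to handle with a few helpers. The sketch below is illustrative only: the function and class names are hypothetical, it assumes header lines are newline-separated, that a flag of 1 means the completion succeeded, and that valueByteLength counts the raw bytes before base64 encoding.

```python
import base64
from typing import NamedTuple


class Completion(NamedTuple):
    token: int
    ok: bool
    value_len: int


def start_args(name: str, payload: str) -> str:
    """Build the args string for __async_start__: name, newline, payload."""
    return f"{name}\n{payload}"


def parse_poll(response: str) -> list[Completion]:
    """Parse __async_poll__ output: one token<TAB>flag<TAB>length line per completion."""
    completions = []
    for line in response.splitlines():
        if not line:
            continue
        token, flag, length = line.split("\t")
        completions.append(Completion(int(token), flag == "1", int(length)))
    return completions


def decode_result(response: str, expected_len: int) -> bytes:
    """Decode __async_result__ output and check it against the polled length."""
    value = base64.b64decode(response)
    if len(value) != expected_len:
        raise ValueError(f"expected {expected_len} bytes, got {len(value)}")
    return value
```

A client would call parse_poll on each poll response and then fetch values one token at a time, which keeps every individual response within the host-call result buffer.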
Design rationale
- Versioned. The Python client is frozen into each consumer’s WASM Wizer snapshot, so the host must stay compatible with already-shipped clients. __async_protocol__ lets a client refuse a host older than the protocol it was built against; bump AsyncHostRegistry.PROTOCOL_VERSION only for breaking wire changes.
- Poll and result are decoupled. __async_poll__ returns only headers (token, ok flag, length); values are fetched one at a time via __async_result__. A batch of completions therefore never exceeds the single host-call result buffer. (A single value larger than that buffer is still a limitation; chunked retrieval is a future protocol addition.)
- Failures never hang. Synchronous handler errors are recorded via AsyncHostRegistry.startFailed and surface as a failed completion (the coroutine raises HostAsyncError); the client also rejects any non-positive token immediately.
- Binary-safe value channel. Completion values are carried as base64 of raw bytes, so extending async returns to bytes later needs no wire change.
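The two client-side guards described above (refusing an older host, rejecting non-positive tokens) could be sketched as follows. This is not the code in boomslang_host/asyncio.py: call stands for any callable with the boomslang_host.call(name, args) shape, the empty args string passed to __async_protocol__ is an assumption, and HostAsyncError is redefined locally so the example is self-contained.

```python
CLIENT_PROTOCOL_VERSION = 1  # the protocol version this client was built against


class HostAsyncError(Exception):
    """Local stand-in for the error the real client raises."""


def check_protocol(call) -> int:
    """Refuse a host that speaks an older protocol than this client."""
    host_version = int(call("__async_protocol__", ""))
    if host_version < CLIENT_PROTOCOL_VERSION:
        raise HostAsyncError(
            f"host protocol {host_version} is older than client protocol {CLIENT_PROTOCOL_VERSION}"
        )
    return host_version


def start(call, name: str, payload: str) -> int:
    """Start a named async handler and reject any non-positive token immediately."""
    token = int(call("__async_start__", f"{name}\n{payload}"))
    if token <= 0:
        raise HostAsyncError(f"host returned invalid token {token} for {name!r}")
    return token
```

Failing fast on a bad token means a coroutine never waits on a completion that the host will not deliver.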
Implementations
The protocol is implemented by the Java AsyncHostRegistry (core/), the generated Rust host registry (hostgen’s rust_host.rs template), and the Python client (boomslang_host/asyncio.py). They must agree byte-for-byte; consolidation of the duplicated implementations is tracked in issue #45.
hostgen DSL Reference
boomslang-hostgen is both a Rust library (used from an extension crate’s build.rs) and a CLI. The library declares an extension and emits generated code + ABI JSON; the CLI consumes ABI JSON and generates host adapters.
Declaring an extension (build.rs)
use boomslang_hostgen::{Build, ExtensionSpec, Type};

fn main() {
    let ext = ExtensionSpec::new("myext")
        .wasm_module("myext")
        .prewarm(["_myext"])
        .function("do_thing", |f| {
            f.param("input", Type::String).returns(Type::String)
        })
        .function("lookup", |f| {
            f.r#async()
                .param("request", Type::String)
                .param("shard", Type::Int)
                .returns(Type::String)
        });
    Build::new(ext).emit().generate().expect("generate myext");
    println!("cargo:rerun-if-changed=build.rs");
}
ExtensionSpec
| Method | Effect |
|---|---|
| ExtensionSpec::new(name) | Start a spec; name is the extension/Python module name. |
| .wasm_module(module) | WASM import module for the functions (defaults to the extension name). |
| .prewarm([modules]) | Python modules to import during Wizer init (frozen into the snapshot). |
| .function(name, \|f\| ...) | Declare a host function via the closure. |
FunctionSpec (inside the closure)
| Method | Effect |
|---|---|
| .param(name, Type) | Append a typed parameter (order matters). |
| .returns(Type) | Declare the return type (omit for none). |
| .r#async() | Mark as an async host call: Python awaits it; the host handler is asynchronous. Async functions must return Type::String. |
Type is String, Int, Float, or Bytes. See lowering rules for the WASM signatures these produce.
Build
| Method | Output |
|---|---|
| Build::new(spec) | Start from a spec. |
| .emit() | Shorthand for .emit_rust_guest().emit_abi_json(), the standard build.rs setup. |
| .emit_rust_guest() | $OUT_DIR/ext_<name>.rs: guest code, consumed by include!(concat!(env!("OUT_DIR"), "/ext_<name>.rs")). |
| .emit_abi_json() | $OUT_DIR/<name>.abi.json. |
| .emit_abi_json_to(path) | ABI JSON at a stable path of your choosing (recommended when other builds consume it; $OUT_DIR paths contain build fingerprints). |
| .emit_java_host(out_dir, package) | <Name>HostFunctions.java under out_dir/<package path>/. Prefer running the CLI after the build instead of writing into a source tree from build.rs. |
| .emit_rust_host(out_dir) | host_<name>.rs Wasmtime adapter. |
| .generate() | Validate the manifest and write everything requested. |
Validation enforces: exact abi_version match, identifier-safe names (no Java/Rust keywords), no reserved __async_* function names, and string returns for async functions.
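For tooling that reads ABI JSON outside hostgen, the rules above can be approximated in a few lines. The sketch below is not the hostgen validator: validate_manifest is a hypothetical name, SAMPLE_KEYWORDS is only an illustrative subset because the actual keyword list is not given here, and it also applies the closed type enum from the ABI JSON page.

```python
import re

SUPPORTED_ABI_VERSION = 1
ABI_TYPES = {"string", "int", "float", "bytes"}
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
# Illustrative subset only; the real Java/Rust keyword list is not given in this document.
SAMPLE_KEYWORDS = {"class", "fn", "let", "new", "static", "struct"}


def validate_manifest(manifest: dict) -> list[str]:
    """Return a list of human-readable validation errors (empty when valid)."""
    errors = []
    version = manifest.get("abi_version", 1)  # omitted abi_version defaults to 1
    if version != SUPPORTED_ABI_VERSION:
        errors.append(f"unsupported abi_version {version}")

    for func in manifest.get("functions", []):
        name = func["name"]
        if not IDENTIFIER.fullmatch(name) or name in SAMPLE_KEYWORDS:
            errors.append(f"{name}: not an identifier-safe name")
        if name.startswith("__async_"):
            errors.append(f"{name}: reserved async control name")
        for param in func.get("params", []):
            if param["type"] not in ABI_TYPES:
                errors.append(f"{name}: unknown param type {param['type']}")
        returns = func.get("returns")
        if returns is not None and returns not in ABI_TYPES:
            errors.append(f"{name}: unknown return type {returns}")
        if func.get("async", False) and returns != "string":
            errors.append(f"{name}: async functions must return string")
    return errors
```

Collecting every error instead of stopping at the first makes a build failure easier to fix in one pass.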
The CLI
boomslang-hostgen <abi.json> [--java-out DIR [--java-package PKG]] [--rust-host-out DIR]
| Flag | Effect |
|---|---|
| --java-out DIR | Generate the Java host adapter into DIR (package subdirectories created). |
| --java-package PKG | Java package for generated code (default com.hubspot.boomslang.extensions). |
| --rust-host-out DIR | Generate the Rust Wasmtime host adapter into DIR. |
With no output flag, the CLI validates the ABI JSON and then exits nonzero because no output was requested.
From source: cargo run --manifest-path boomslang-hostgen/Cargo.toml -- <args>.
Library entry points
For build tooling that wants codegen without the CLI:
- read_abi(path) -> Manifest: parse + validate an ABI JSON file.
- generate_java(abi_path, out_dir, package): Java adapter from a file.
- generate_rust_host(abi_path, out_dir): Rust host adapter from a file.
The serde-serializable Manifest / Extension / Function / Param / Type structs are public; the ABI JSON schema is their serialized form.
API Docs
Java
Javadoc for the published artifact is served by javadoc.io, generated from the -javadoc jar that ships to Maven Central with every release:
https://javadoc.io/doc/com.hubspot/boomslang
Key entry points:
- PythonExecutorFactory: build one per process; creates instances and owns the WASM thread pool.
- PythonInstance: a single execution context: execute, compile/loadCode, reset.
- PythonResult: captured stdout/stderr, exit code, timing.
- HostBridge: register Java handlers callable from Python.
- AsyncHostRegistry: broker for async host calls.
- ResourceLimits: per-instance output/memory caps.
Rust
Rustdoc for boomslang-hostgen (the extension DSL and codegen library) is built in CI and published with this site at /api/rust/.
The guest crates (boomslang-host-core, python-host) require the WASI/PyO3 build environment and are not yet on the docs site; read them in the repo.
Glossary
Boomslang sits at the intersection of three ecosystems (JVM, WebAssembly, CPython), and some names mean different things in each. This page is the tiebreaker.
| Term | Meaning |
|---|---|
| host | The outside process embedding boomslang.wasm and implementing its imports — the Java/Chicory runtime in the default setup, or Rust/Wasmtime in examples/rust-host/. |
| guest | Everything inside boomslang.wasm: the Rust glue code and the CPython interpreter it wraps. |
| host function | A function implemented by the host and imported by the guest — e.g. the Java handlers registered through HostBridge. This is Chicory/WASM terminology. |
| python-host/ | Rust code compiled into the guest. The name is historical: this crate “hosts” CPython (via PyO3) inside the WASM module. It is not the WASM host. |
| python-host-core/ | The reusable core of the guest, published as the crate boomslang-host-core. Custom Python builds compose this with their own extension crates. |
| extension | A set of typed host functions declared with the boomslang-hostgen DSL in a crate’s build.rs. The guest half is generated Rust; the host half is a generated Java or Rust adapter. |
| boomslang (import module) | The WASM import namespace the guest expects its host functions under (e.g. boomslang.call, boomslang.log). |
| boomslang_host (Python module) | The Python-side bridge module available to guest code: boomslang_host.call(...), boomslang_host.log(...), boomslang_host.asyncio. |
| golden snapshot | The pre-initialized guest memory image produced by Wizer at build time. New PythonInstances are copy-on-write views of it, which is why instance creation is milliseconds instead of a full CPython start. |
| prewarm | Importing a Python module during Wizer initialization so it is frozen into the golden snapshot. Prewarmed modules are served from the snapshot’s sys.modules at runtime. |
| AOT | Chicory’s ahead-of-time translation of the WASM module to JVM bytecode (com.hubspot.boomslang.compiled.*), avoiding interpretation overhead. |
| stdlib path | The host directory where the factory extracts packaged Python resources; instance roots are mounted from it. |
| overlay | Source-controlled files under core/src/main/resources/python-overlay/ copied over the generated Python tree at extraction time. |
Building from Source
The build is driven by Mill (build.mill); the justfile is a thin shim over Mill targets for common loops. Requirements: Java 21, Maven, just, and a container engine — Docker on Linux, Docker or Apple container on macOS.
With Nix, the dev shell provides Java 21, Maven, just, mdBook, Python 3, the WASI SDK, and the Maven JDK toolchain configuration required by basepom:
nix develop
A container engine still needs to be installed and running on the host for the full WASM pipeline.
Full build
./mill artifacts.installAll # native WASM artifacts (containers), Rust guest, Python resources
./mill build # Maven package incl. Java AOT classes
./mill test # integration tests
First runs take about an hour: CPython and the native libraries build inside containers.
Skipping the pipeline: fetch-main-wasm
For Java-only work you don’t need to build the runtime at all:
just fetch-main-wasm # installs the latest main runtime release assets into core resources
just build # package with AOT, skips tests
just test
just fetch-main-wasm downloads the latest successful main runtime artifact from GitHub release assets into core/src/main/resources/python/bin/ and python/usr/. Select a specific artifact with just fetch-main-wasm -- --sha <commit-sha> (or --branch <name>).
Mind the mismatch: fetched resources are built from main, not from your working tree. If your checkout contains Rust/guest changes, a fetched runtime silently won’t include them; rebuild with just wasm instead.
Change loops
Java-only changes:
mvn compile -pl core
mvn test -pl tests
Rust/guest changes (python-host/, python-host-core/, extensions/):
just wasm # rebuild WASM + Wizer snapshot
just resources # repopulate Java resources
just build # rebuild Java AOT classes
just test
Python package (boomslang-py/ — see the Python host guide):
just python-stage # copy runtime resources + overlay into the package (needs fetch-main-wasm or resources first)
just python-test # staged resources + venv + pytest
just python-wheel # build dist/boomslang-<version>-py3-none-any.whl
CI attaches the wheel to GitHub releases (not PyPI).
Docs: mdbook serve docs (mdBook is in the Nix shell).
Container engine selection
Docker is the default. The selected engine is stored in the git-ignored .boomslang-container-cli file so Mill daemon builds see a stable input; the ./mill wrapper also writes it when BOOMSLANG_CONTAINER_CLI is set.
./mill artifacts.setContainerCli --cli docker # or: --cli container (Apple)
./mill artifacts.showContainerCli
Docker builds require BuildKit/buildx. For Apple container, run container system start first.
Pipeline stages
The native pipeline lives under cpython/, one container build per component:
just build-pydantic-core-wasi # ~15 min (Rust compilation)
just build-numpy-wasi # ~10 min
just build-pandas-wasi # ~10 min
just build-matplotlib-wasi # ~10 min
just build-pillow-wasi # ~10 min
just build-ijson-wasi # ~5 min
just build-cpython-wasi # ~20 min (links all of the above)
just pip-packages # pure-Python packages (pydantic, jinja2, ...)
just wasm # Rust guest + Wizer pre-init
just resources # populate core/src/main/resources
just build
just test
Inspect the artifact DAG and caching:
./mill artifacts.dag
./mill artifacts.dagDot
./mill artifacts.cacheStatus
./mill path artifacts.installAll artifacts.wasm
./mill plan artifacts.installAll prints execution order. To verify caching, run ./mill artifacts.installAll twice — the second run should skip task bodies.
CI
.github/workflows/build.yml rebuilds everything from source in containers, validates the generated runtime, runs the tests, and publishes runtime assets (wasm + resource tarball + checksums) to GitHub Releases — tagged releases for v* tags, build-<sha> prereleases for every main commit (these are what fetch-main-wasm consumes). docs.yml builds this book and deploys it to GitHub Pages on pushes to main.
Architecture
How a print('hello') gets executed, from the bottom up. Terms used precisely here are defined in the glossary — in particular host (the JVM/Wasmtime side) vs. guest (everything inside boomslang.wasm).
The guest: CPython on WASI
CPython 3.14 is compiled to wasm32-wasip1 with native extension modules (NumPy, Pandas, Matplotlib, Pillow, ijson, pydantic-core) statically linked — WASI has no dynamic linking. A thin Rust layer (python-host-core/, composed into the stock guest by python-host/) wraps the interpreter with PyO3 and exposes the base ABI: execute, compile_source, the output-buffer protocol, and friends.
Extensions add typed WASM imports: each is declared with the hostgen DSL, which generates the guest-side Rust (a Python module backed by WASM imports) and the host-side adapters from a shared ABI JSON.
The golden snapshot (Wizer)
Starting CPython — initializing the interpreter, importing NumPy and Pandas — takes seconds. Boomslang does it once, at build time: Wizer runs the guest’s initialization (including importing every prewarm module) and snapshots the resulting linear memory into the module itself. The shipped boomslang.wasm wakes up already initialized.
Consequence worth knowing: prewarmed modules live in the snapshot’s sys.modules. Changing their source on disk (e.g. via the resource overlay) has no effect until the WASM is rebuilt.
Copy-on-write instances
At runtime the Java host goes one step further. RuntimeImage instantiates the module once and keeps its post-initialization memory as the shared golden memory. Each PythonInstance gets a CopyOnWriteMemory: reads are served from the shared golden pages; writes materialize private copies of just the touched 64 KiB pages.
- Creating an instance is O(1) — no memory copy up front.
- Instances are isolated: one instance’s writes never affect another’s.
- reset() discards the private pages, snapping the instance back to the pristine snapshot.
This is why the factory should be a process-wide singleton (it pins the golden memory) while instances are disposable.
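The page-level behavior described above can be modeled in a few lines. This is a toy Python model for intuition only, not the Java CopyOnWriteMemory: ToyCowMemory is a hypothetical name, and it assumes the golden image is a whole number of 64 KiB pages.

```python
PAGE_SIZE = 64 * 1024  # WASM page size


class ToyCowMemory:
    """Reads fall through to shared golden bytes; writes copy only touched pages."""

    def __init__(self, golden: bytes):
        self._golden = golden   # shared, never mutated
        self._private = {}      # page index -> private bytearray copy

    def read(self, offset: int, length: int) -> bytes:
        out = bytearray()
        while length > 0:
            page, start = divmod(offset, PAGE_SIZE)
            chunk = min(length, PAGE_SIZE - start)
            private = self._private.get(page)
            if private is None:
                base = page * PAGE_SIZE
                out += self._golden[base + start:base + start + chunk]
            else:
                out += private[start:start + chunk]
            offset += chunk
            length -= chunk
        return bytes(out)

    def write(self, offset: int, data: bytes) -> None:
        pos = 0
        while pos < len(data):
            page, start = divmod(offset + pos, PAGE_SIZE)
            chunk = min(len(data) - pos, PAGE_SIZE - start)
            private = self._private.get(page)
            if private is None:
                # First write to this page: materialize a private copy.
                base = page * PAGE_SIZE
                private = bytearray(self._golden[base:base + PAGE_SIZE])
                self._private[page] = private
            private[start:start + chunk] = data[pos:pos + chunk]
            pos += chunk

    def reset(self) -> None:
        self._private.clear()   # back to the pristine golden image

    def private_pages(self) -> int:
        return len(self._private)
```

Two ToyCowMemory objects built over the same golden bytes share everything until one of them writes, and a write that straddles a page boundary materializes exactly two private pages.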
AOT execution (Chicory)
Chicory can interpret WASM, but boomslang ships ahead-of-time compiled JVM bytecode for the bundled module: the chicory-compiler-maven-plugin translates boomslang.wasm into com.hubspot.boomslang.compiled.* classes at build time. The factory uses them when present (isAotAvailable()) and falls back to the interpreter otherwise — roughly an order of magnitude slower, so a missing-AOT warning in logs deserves attention.
Python is thus executing as JVM bytecode, JIT-compiled by HotSpot like everything else — no JNI, no native memory outside the Java heap.
Execution flow
Java caller
└─ factory.runOnWasmThread(...) dedicated thread, enlarged JVM stack
└─ instance.execute(src) PythonInstance, one per context
├─ alloc/write/execute base ABI calls into the guest
│ └─ PyO3 → CPython runs the script
│ └─ import boomslang_host → WASM imports → Java HostBridge handlers
└─ get_stdout/get_stderr captured output back to Java
Host functions are the only way out of the sandbox: the guest sees WASI (rooted at the instance directory) plus exactly the imports you registered.
The build pipeline
cpython/
pydantic-core-wasi ─┐
numpy-wasi ─────────┤
pandas-wasi ────────┤
matplotlib-wasi ────┼→ cpython-wasi → python-host (Rust guest) → Wizer → Java AOT
pillow-wasi ────────┤ (containers) (just wasm) (mvn)
ijson-wasi ─────────┘
Native components build in containers (cpython/builder/ provides the WASI SDK + Wizer + Binaryen image); Mill orchestrates the DAG and caches each stage. See Building from Source.
Repository Map
| Path | What it is |
|---|---|
| core/ | Java runtime API (PythonExecutorFactory, PythonInstance, HostBridge, CopyOnWriteMemory) and the bundled Python resources. |
| boomslang-py/ | Python host package: Sandbox API over wasmtime-py, shipped as a wheel bundling the WASM runtime (GitHub release asset). |
| python-host/ | The stock Rust guest crate: composes boomslang-host-core with the built-in host-bridge extension and builds to boomslang.wasm. (Named for hosting CPython, not for being the WASM host; see the glossary.) |
| python-host-core/ | Reusable guest core (crate boomslang-host-core): PyO3 wrapper, base ABI exports, init plumbing for extensions. |
| extensions/ | Extension crates. host-bridge/ is the built-in boomslang_host.call/log bridge, including the boomslang_host Python package. |
| boomslang-hostgen/ | The extension code generator: Rust DSL library + CLI, templates for guest Rust, Java adapters, and Rust/Wasmtime adapters. |
| cpython/ | Native build pipeline: one *-wasi/ directory per native component (CPython itself, NumPy, Pandas, Matplotlib, Pillow, ijson, pydantic-core) and builder/, the shared container image (WASI SDK + Wizer + Binaryen + Rust). |
| examples/custom-python-build/ | Building a custom guest with your own typed extensions. |
| examples/rust-host/ | Embedding from Rust/Wasmtime with adapters generated from ABI JSON. |
| tests/ | Java integration tests (run against the packaged runtime). |
| benchmarks/ | JMH benchmarks. |
| docs/ | This book (mdBook). |
| build.mill, justfile | Build orchestration: Mill is the engine, just is the shim for common loops. |
| scripts/ | Build support scripts, including fetch-main-runtime-resources.sh. |