Caffeine | thisContext

Archive for the Caffeine Category

livecoding new AI skills with conversational MCP

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, SqueakJS with tags AI, Caffeine, consulting, context, JavaScript, livecoding, MCP, reflection, smalltalk, squeak, SqueakJS on 24 September 2025 by Craig Latta

A person is speaking, indicated by a speech balloon full of colorful sprockets. A robot, with a head full of colorful sprockets, has grabbed one of the spoken sprockets. — *Through conversation, we can add skills to a model as new MCP tools.*

The Model Context Protocol gives us a way to extend the skills of an LLM, by associating a local function with natural-language descriptions of what it does, and making that tool available as additional conversational context. Once the model can infer from conversation that a tool should be used, it invokes the tool’s function via the protocol. Tools are made available by an MCP server, augmenting models with comprehensive sets of related skills. There are MCP servers for filesystem access, image processing, and many other domains. New registries are emerging to provide MCP server discovery, and old ones (like npm) are becoming good sources as well.

For someone who can create the underlying tool functions, a personal MCP server can supercharge a coding assistant with project-specific skills. I’ve added an MCP server proxy to my Caffeine server bridge, enabling a Smalltalk IDE running in a web browser to act as an MCP server, and enhancing the AI conversations that one may have in the IDE. I can dynamically add tools, either through conversation or in Smalltalk code directly.

anatomy of an MCP tool

In modeling an MCP tool, we need to satisfy the denotational requirements of both the protocol (to enable model inference) and the environment in which the functions run (to enable correct function invocation). For the protocol, we must provide a tool name and description, and schemas for the parameters and the result. For the Smalltalk environment, we need the means to derive, from invocation data, the components of a message: a receiver, selector, and parameters.

In my implementation, an instance of class FunctionAITool has a name, description, result, parameters, selector, and a receiver computation. The receiver computation is a closure which answers the intended receiver. Each of the parameters is an instance of class FunctionAIToolParameter, which also has a name and description, as well as a type, a boolean indicating whether it is required, and a parameter computation similar to the receiver computation. The tool’s result is also a FunctionAIToolResult, which can have a result computation but typically doesn’t, since the computation is usually done by the receiver performing the selector with the parameters. The result computation can be used for more complex behavior, extending beyond a single statically-defined method.

The types of the result and each parameter are usually derived automatically from pragmas in the method of the receiver named by the selector. (The developer can also provide the types manually.) These pragmas provide type annotations similar to those used in generating the WebAssembly version of the Smalltalk virtual machine. The types themselves come from the JSON Schema standard.

routing MCP requests from the model to the environment

Caffeine is powered by SqueakJS, Vanessa Freudenberg’s Smalltalk virtual machine in JavaScript. In a web browser, there’s no built-in way to accept incoming network connections. Using Deno integration I wrote, a server script acts as a proxy for any number of remote peers with server behavior, connected via websockets. In the case of MCP, the script can service some requests itself (for example, during the initialization phase). The script also speaks Caffeine’s Tether remote object messaging protocol, so it can forward client requests to a remote browser-based Caffeine instance using remote messages. That Caffeine instance can service requests for the tools list, and for tool calls. The server script creates streaming SSE connections, so that both client and server can stream notifications to each other over a persistent connection.

Caffeine creates its response to a tools list request by asking each tool object to answer its metadata in JSON, and aggregating the responses. To service a tool call, if the tool has a selector, it uses the receiver computation to get a receiver, and the parameter computations to both validate each client-provided parameter and derive an appropriate Smalltalk object for it. Now the receiver can perform the tool’s selector with the parameters. If the tool has a result computation instead of a selector, the tool will evaluate it with the derived parameters. A tool’s result can be of a simple JSON type, or something more sophisticated like a Smalltalk object ID, for use as an object reference parameter in a future tool call.

conversational tool reflection

For simple tools, it’s sometimes useful to keep development entirely within a conversation, without having to bring up traditional Smalltalk development tools at all. There’s not much motivation for this when the conversation is in a Smalltalk IDE, where conversations can happen anywhere one may enter text, and the Smalltalk tools are open anyway. But it can transform a traditional chatbot interface.

This extends to the development of the tools themselves. I’ve developed a tool that can create and edit new tools. I imagine conversations like this one:

*Creating a new MCP tool through conversation.*

What would you do with it?

This capability could enable useful Smalltalk coding assistants. I’m especially keen to see how we might create tools for helping with debugging. What would you do?

AI-assisted just-in-time compilation in Catalyst

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS with tags Caffeine, consulting, context, JavaScript, livecoding, smalltalk, spoon, squeak, SqueakJS on 22 July 2025 by Craig Latta

*AI can help us find the “desire paths” in stack instructions.*

There’s a long tradition of just-in-time compilation of code in livecoding systems, from McCarthy’s LISP systems in the 1960s, to the dynamic method translation of the Deutsch-Schiffman Smalltalk virtual machine, to the “hotspot” compilers of Self, Strongtalk and Java, to current implementations for JavaScript and WebAssembly in Chrome’s V8 and Firefox’s SpiderMonkey. Rather than interpret sequences of virtual machine instructions, these systems translate instruction sequences into equivalent (and ideally more efficient) actions performed by the instructions of a physical processor, and run those instead.

We’d like to employ this technique with the WASM GC Catalyst Smalltalk virtual machine as well. Translating the instructions of a Smalltalk compiled method into WASM GC instructions is straightforward, and there are many optimizations we can specify ahead of time for optimizing those instructions. But with the current inferencing abilities of artificial intelligence large language models (LLMs), we can leave even that logic until runtime.

dynamic method translation by LLM

Since Catalyst runs as a WASM module orchestrated by SqueakJS in a web browser, and the web browser has JavaScript APIs for WASM interoperation, and there are JS APIs for interacting with LLMs, we can incorporate LLM inference in our translation of methods to WASM functions. We just need an expressive system for composing appropriate prompts. Using the same Epigram compilation framework that enables the decompilation of the Catalyst virtual machine itself into WASM GC, we can express method instructions in a prompt, by delegating that task to the reified instructions themselves.

For an example, let’s take the first method developed for Catalyst to execute, SmallInteger>>benchmark, a simple but sufficiently expensive benchmark. It repeats this pattern five times: add one to the receiver, multiply the result by two, add two to the result, multiply the result by three, add three to the result, and multiply the result by two. This is trivial to express as a sequence of stack operations, in both Smalltalk instructions and WASM instructions.

Our pre-written virtual machine code can do the simple translation between those instruction sets without using an LLM at all. With a little reasoning, an LLM can recognize from those instructions that something is being performed five times, and write a loop instead of inlining all the operations. With a little more reasoning, it can do a single-cycle analysis and discover the algebraic relationship between the receiver and the output (248,832n + 678,630). That enables it to write a much faster WASM function of five instructions instead of 62.

the future

This is a contrived example, of course, but it clearly shows the potential of LLM-assisted method translation, at least for mathematical operations. I’ve confirmed that it works in Catalyst, and used the results to populate a polymorphic inline cache of code to run instead of interpretation. Drawing inspiration from the Self implementation experience, what remains to be seen is how much time and money is appropriate to spend on the LLM. This can only become clear through real use cases, adapting to changing system conditions over time.

2 Comments »

Catalyst update: a WASM GC Smalltalk virtual machine and object memory are running

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS with tags Caffeine, consulting, context, JavaScript, livecoding, naiad, smalltalk, spoon, squeak, SqueakJS on 29 June 2025 by Craig Latta

I’ve bootstrapped a Smalltalk virtual machine and object memory as a WebAssembly (WASM) module, using the type system that supports garbage collection there. I have two motivations for doing this: I’d like to see how fast it can run, and I’d like to see how it can interoperate with other WASM modules in diverse settings, including web browsers, servers, and native mainstream OS apps.

the current test: evaluating (3 squared)

The very first thing I ran was a method adding three and four, reporting the result through a webpage. This required types and classes for Object, SmallInteger, ByteArray, Dictionary, Class, CompiledMethod, Context, and Process, a selectors for #+, and functions for interpreting bytecodes, creating arrays, dictionaries, contexts, and the initial object memory, manipulating dictionaries, stacks, contexts, and for reporting results to JavaScript.

Evaluating (3 + 4) only uses an addition bytecode, instead of actually sending a message. After I got a successful result, I changed the expression to (3 squared). This tested sending an actual message, creating a context for the computation, invoking a method different from the one sending the message.

Using WASM’s JavaScript interoperation facilities, I export two WASM functions to JS for execution in a web browser: createMinimalBootstrap() and interpret(). The createMinimalBootstrap function creates classes, selectors, an unbound method that sends “squared” and an initial context for it, and a “squared” method installed in class SmallInteger, and initializes interpreter state.

With the interpreter and object memory set up, JS can tell the WASM module to start interpreting bytecodes, with interpret(). The initial unbound method, after the bytecodes for performing (3 squared), has a special bytecode for reporting the result to JS. It calls a JS function imported from the webpage, which simply prints the result in the webpage. The webpage also reports how long the interpreter takes to run; it might be interesting when measuring the speed of methods compiled to WASM functions.

the interpreter

*the SqueakWASM virtual machine returning a result to JavaScript*

The interpreter implements all the traditional Smalltalk bytecodes, and a few more for interoperating with JavaScript. Smalltalk objects are represented with WASM GC reference types: i31refs for SmallIntegers, and structrefs for all other objects. There is a type hierarchy mirroring the Smalltalk classes used in the system. With all objects implemented as reference types, rather than as byte sequences in linear WASM memory, we can leverage WASM’s garbage collector. This is similar to the way SqueakJS leverages the JS runtime engine’s garbage collector in a web browser. Also like SqueakJS, SmallIntegers larger than a signed 31-bit integer are boxed LargeIntegers, as WASM GC doesn’t yet have a built-in reference type for 63-bit signed integers.

the next test: just-in-time compilation of methods to WASM

Now that I can run SmallInteger>>squared as interpreted bytecodes, I’ll write a rudimentary translator from bytecode sequences to WASM functions. It may provide an interesting micro-benchmark for comparing execution speeds.

future work: reading snapshots and more

Obviously, a Smalltalk virtual machine does many things; this is a tiny but promising beginning. In the near future, I’d like to support reading existing Squeak, Pharo, and Cuis object memories, provide more extensive integration with device capabilities through JavaScript, web browsers, and WASI, and support the Sista instruction set for compatibility with the OpenSmalltalk Cog virtual machine. I’m especially interested to see how SqueakWASM might integrate with other WASM modules in the wild.

What would you do with WASM Smalltalk? Please let me know!

1 Comment »

a Tethered server-side display

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, SqueakJS with tags Caffeine, consulting, context, JavaScript, livecoding, smalltalk, squeak, SqueakJS on 18 May 2025 by Craig Latta

Server-side Caffeine can display in your web browser.

Tether is Caffeine‘s remote messaging protocol. It enables messaging between multiple Caffeine systems running in web browsers, other JavaScript runtimes, or native apps, using TCP sockets, Web Sockets, or WebRTC data channels. With the Deno JavaScript runtime, Caffeine can run server-side in a Web Worker thread, with the main thread providing a websocket for Caffeine to use, and acting as a bridge for Tether traffic between the worker and connected clients. Every system speaking the protocol does so with a local instance of class Tether.

Usually, the services provided by the worker Caffeine don’t involve graphics, and the system is headless. However, we can also perform all the traditional graphics behavior, using a remote Caffeine instance as the display.

a networked graphics pipeline

The main Deno thread, the bridge, communicates with the Caffeine worker thread through message handlers invoked on each side with postMessage(). The bridge communicates with a client Caffeine system over a websocket, TCP socket, or WebRTC data channel. The key to remote display is in the worker’s implementation of interpreter.showFormAt():

When the system is headless, the HMTL canvas graphics context that would normally be used for image data is null. The interpreter instead uses postMessage() to pass the pixel data to the bridge. The bridge can then broadcast those pixels to interested clients. Normally, the bridge only relays remote messages and their answers amongst the worker and the clients. We can consider a display update as the parameter for an invocation of a virtual block closure callback, handled specially by each receiving Tether. When such an update is received by a system hosted in a web browser, the Tether puts image data to the graphics context of a local HTML canvas, just as the original system would have done normally. The worker’s BitBLT is invoked normally to calculate the pixels, and can use WebAssembly to speed things up.

on-demand display enables livecoded server development

When the worker is truly headless, with no graphics-interested clients connected to the bridge, we don’t have to pay the cost of BitBLT/WASM graphics calculations. We can tell from the worker’s Smalltalk object memory that the system is headless, and avoid even asking the interpreter to calculate pixels. But a display can be provided on demand, enabling the livecoding of server apps. This makes the development process much more efficient, by eliminating the loops of implementation and testing that an always-headless server would cause.

With this ability, I’m writing a dynamic MCP server framework that accepts websocket connections from AI models, and mediates model communication with the subject app. Being able to livecode this will provide further efficiency benefits, by never having to interrupt the model’s interaction with either the MCP server or the subject app.

What would you use it for?

a voice-controlled Ableton Live

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, music, Smalltalk, SqueakJS with tags Ableton, AI, Caffeine, consulting, context, JavaScript, livecoding, LLM, smalltalk, squeak, SqueakJS on 5 January 2025 by Craig Latta

*These robots know the value of keeping your hands on your instrument!*

I’ve gotten up to speed on AI programming, and it didn’t hurt a bit. After learning the OpenAI chat and realtime APIs, I’m able to integrate generative AI text and speech into Caffeine. I got off to a good start by adding the ability to evaluate natural language with the same tools used to evaluate Smalltalk expressions in text editors. For my next application, I’m writing voice control for Ableton Live. This will let me keep my hands on a musical instrument instead of the keyboard or mouse while using Live.

This ties together my new realtime OpenAI client with an enabling technology I wrote previously: programmatic control of Live from a web browser. The musician speaks into a microphone, and a large language model translates their words into code that Caffeine can run on the Live API. With AI, the spoken commands can be relatively specific (“fade out at the end”) or very abstract (“use foreboding chords”).

The realtime OpenAI client uses a WebRTC audio channel to send spoken commands to the language model, and a WebRTC data channel to answer text as JSON data. I expect I’ll use system prompts instructing the model to respond using a domain-specific language (DSL) that can be run fairly directly by Caffeine. This will take the form of OpenAI tools that associate natural-language function descriptions with function signatures. The AI can deduce from conversational context when it should call a function, and the functions run locally (not on the OpenAI servers). I’ll define many functions that control various aspects of Ableton Live; prompts given by the musician will invoke them.

I imagine the DSL will be a distillation of the most commonly-used functions in Ableton Live’s very large API, and that it’ll emerge from real DAW use. What would you want to say to your DAW?

Context-Aware AI Conversations in Smalltalk

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, SqueakJS with tags 1337, AI, Caffeine, debugging, JavaScript, livecoding, OOP, smalltalk, squeak, SqueakJS, web services on 3 December 2024 by Craig Latta

There’s a lot of Smalltalk knowledge in the pre-training data of most LLMs.

I’ve been stumbling toward a “good enough” understanding of Smalltalk by an AI large language model, and Smalltalk tools for integrating conversations into the workflow. So far, I’ve been doing this through model fine-tuning with English system prompts, without resorting to code at all. I’ve been impressed with the results. It seems the pre-training that the OpenAI gpt-4o model has about Smalltalk and Squeak is a decent basis for further training. I evolve the prompts in response to chat completion quality (usually by applying more constraints, like “When writing code, don’t send a message to access an object when you can access it directly with an instance variable.”).

I wanted to converse with the language model from any text pane in Squeak, via the classic “do it”, “print it”, and “inspect it” we’re used to using with Smalltalk code. I changed Compiler>>evaluateCue:ifFail: to handle UndeclaredVariable exceptions, by delegating to the model object underlying the text pane in use. (It’s usually an UndeclaredVariable exception that happens first when one attempts to evaluate an English phrase. For example, “What” in “What went wrong?” is unbound.) That model object, in turn, handles the exception by interpreting the next chat completion from the language model.

The model objects I’ve focused on so far are instances of Debugger and Inspector. One cute thing about this approach is that it records do-its for English prompts just like it does for Smalltalk code, in the changes log. Each model can supply its own system prompts to orient conversations, and can interpret chat completions in a variety of ways (like running Smalltalk code written by the language model). Each model object also keeps a reference to its most recent chat completion, so that successive prompts are submitted to the language model in the context of the complete conversation so far.

With all this in place, evaluating “What went wrong?” in a debugger text pane gives surprisingly correct, detailed, and useful answers. Running the code answered to “Write code for selecting the most recent context with a BlockClosure receiver.” manipulates the debugger correctly.

Next, I’m experimenting with prompts for describing an application’s domain, purpose, and user interface. I’m eager to see where this leads. :)

1 Comment »

dynamic translation of Smalltalk to WebAssembly

Posted in Caffeine, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS with tags Caffeine, context, JavaScript, livecoding, smalltalk, squeak, SqueakJS, WebAssembly on 26 July 2023 by Craig Latta

In Catalyst, a WebAssembly implementation of the OpenSmalltalk virtual machine, there are three linguistic levels in play: Smalltalk, JavaScript (JS), and WebAssembly (WASM). Smalltalk is our primary language, JS is the coordinating language of the hosting environment (a web browser), and WASM is a high-performance runtime instruction set to which we can compile any other language. In a previous article, I wrote about automatic translation of JS to WASM, as a temporary way of translating the SqueakJS virtual machine to WASM. That benefits from a proven JS starting point for the relatively large codebase of the virtual machine. When translating individual Smalltalk compiled methods for “just-in-time” optimization, however, it makes more sense to translate from Smalltalk to WASM directly.

compiled method transcription

We already have infrastructure for transcribing Smalltalk compiled methods, via class InstructionStream. We use it to print human-readable descriptions of method instructions, and to simulate their execution in the Smalltalk debugger. We can also use it to translate a method to human-readable WebAssembly Text (WAT) source code, suitable for translation to binary WASM code which the web browser can execute. Since the Smalltalk and WASM instruction sets are both stack-oriented, the task is straightforward.

I’ve created a subclass of InstructionStream, called WATCompiledMethodTranslator, which uses the classic scanner pattern to drive translation from Smalltalk instructions to WASM instructions. With accompanying WASM type information for Smalltalk virtual machine structures, we can make WASM modules that execute the instructions for individual Smalltalk methods.

the “hello world” of Smalltalk: 3 + 4

As an example, let’s take a look at translating the traditional first Smalltalk expression, 3 + 4. We’ll create a Smalltalk method in class HelloWASM from this source:

HelloWASM>>add
	"Add two numbers."

	^3 + 4

This gives us a compiled method with the following Smalltalk instructions. On each line below, we list the program counter value, the instruction, and a description of the instruction.

0: 0x20: push the literal constant at index 0 (3) onto the method's stack
1: 0x21: push the literal constant at index 1 (4) onto the method's stack
2: 0xB0: send the arithmetic message at index 0 (+)
3: 0x7C: return the top of the method's stack

A WATCompiledMethodTranslator uses an instance of InstructionStream as a scanner of the method, interpreting each Smalltalk instruction in turn. When interpreting an instruction, the scanner sends a corresponding message to the translator, which in turn writes a transcription of that instruction as WASM instructions, onto a stream of WAT source.

The first instruction in the method is “push the literal constant at index 0”. The scanner finds the indicated literal in the literal frame of the method (i.e., 3), and sends pushConstant: 3 to the translator. Here are the methods that the translator runs in response:

WATCompiledMethodTranslator>>pushConstant: value
	"Push value, a constant, onto the method's stack."

	self
		comment: 'push constant ', value printString;
		pushFrom: [value printWATFor: self]

WATCompiledMethodTranslator>>pushFrom: closure
     "Evaluate closure, which emits WASM instructions that push a value onto the WASM stack. Emit further WASM instructions that push that value onto the Smalltalk stack."

	self
		setElementAtIndexFrom: [
			self
				incrementField: #sp
				ofStructType: #vm
				named: #vm;
				getField: #sp
				ofStructType: #vm
				named: #vm]
		ofArrayType: #pointers
		named: #stack
		from: closure

WATCompiledMethodTranslator>>setElementAtIndexFrom: elementIndexClosure ofArrayType: arrayTypeName named: arrayName from: elementValueClosure
	"Evaluate elementIndexClosure to emit WASM instructions that leave an array index on the WASM stack. Evaluate elementValueClosure to emit WASM instructions that leave an array element value on the WASM stack. Emit further WASM instructions, setting the element with that index in an array of the given type and variable name to the value."

	self get: arrayName.
	{elementIndexClosure. elementValueClosure} do: [:each | each value].

	self
		indent;
		nextPutAll: 'array.set $';
		nextPutAll: arrayTypeName

In the final method above, we finally see a WASM instruction, array.set. The translator implements stream protocol for actually writing WAT text to a stream. The comment:, get:, and getField:ofStructType:named: methods are similar, using “;;” and the array.get and struct.get WASM instructions. The array and struct instructions are part of the WASM garbage collection extension, which introduces types.

WASM types for virtual machine structures

To actually use WASM instructions that make use of types, we need to define the types in our method’s WASM module. In pushFrom: above, we use a struct variable of type vm named vm, and an array variable of type pointers named stack. The vm variable holds global virtual machine state (for example, the currently executing method’s stack pointer), similar to the SqueakJS.vm variable in SqueakJS. The stack variable holds an array of Smalltalk object pointers, constituting the current method’s stack. In general, the WASM code for a Smalltalk method will also need fast variable access to the active Smalltalk context, the active context’s stack, the current method’s literals, and the current method’s temporary variables.

Our WASM module for HelloWASM>>add might begin like this:

(module
	(type $bytes (array (mut i8)))
 	(type $words (array (mut i32)))
 	(type $pointers (array (ref $object)))

 	(type $object (struct
		(field $metabits (mut i32))
		(field $class (ref $object))
		(field $format (mut i32))
		(field $hash (mut i32))
		(field $pointers (ref $pointers))
		(field $words (ref $words))
		(field $bytes (ref $bytes))
		(field $float (mut f32))
		(field $integer (mut i32))
		(field $address (mut i32))
		(field $nextObject (ref $object))))

	(global $vm (struct
		(field $sp (mut i32))
		(field $pc (mut i32)))

	(global $stack (array (ref $pointers)))

	(function $HelloWASM_add
		;; pc 0
		;; push constant 3
		global.get $stack
		global.get $vm
		global.get $vm
		struct.get $vm $sp
		i32.const 1
		i32.add
		struct.set $vm $sp ;; increment the stack pointer
		global.get $vm
		struct.get $vm $sp
		i32.const 3
		array.set $pointers
		
		;; pc 1
		...

As is typical with assembly-level code, there’s a lot of setup involved which seems quite verbose, but it enables fast paths for the execution machinery. We’re also effectively taking on the task of writing the firmware for our idealized Smalltalk processor, by setting up interfaces to contexts and methods, and by implementing the logic for each Smalltalk instruction. In a future article, I’ll discuss the mechanisms by which we actually run the WASM code for a Smalltalk method. I’ll also compare the performance of dynamic WASM translations of Smalltalk methods versus the dynamic JS translations that SqueakJS makes. I don’t expect the WASM translations to be much (or any) faster at the moment, but I do expect them to get faster over time, as the WASM engines in web browsers improve (just as JS engines have).

5 Comments »

automated translation of JavaScript to WebAssembly for SqueakJS

Posted in Caffeine, Naiad, Smalltalk, Spoon, SqueakJS with tags smalltalk, squeak on 6 July 2023 by Craig Latta

*creating WASM from JS is a bit like creating DNA from proteins*

After creating a working proof-of-concept Squeak Smalltalk virtual machine with a combination of existing SqueakJS code and handwritten WASM (for the instruction functions), I set about automating the generation of WASM from JS for the rest of the functions. (A hybrid WASM/JS virtual machine has poor performance, because of the overhead of calling JS functions from WASM.) Although I expect eventually to write a JS parser with Epigram, for now I’m using the existing JS parser Esprima, via the JS bridge in SqueakJS. (The benefits of using Epigram here will be greatly improved debugging, portability to non-JS platforms, and retention of parsed comments.) After parsing the SqueakJS VM code into Smalltalk objects representing JS parse nodes, I’m using those objects’ WASM generation behavior to generate a WASM-only virtual machine. I’m taking advantage of the newly-added type management instructions added to WASM, as part of its garbage-collection proposal.

type hinting

To make effective use of those instructions, we need the JS code to give some hints about object structure. For example, the SqueakJS at-cache uses JS objects whose structure is emergent, rather than defined explicitly in advance. If SqueakJS were written in TypeScript, where all structures are defined in advance, we would have this information already. Instead, I add a prototype JS object to the at-cache object, describing the type of an at-cache entry:

this.atCacheEntryPrototype = {
			"array": [],
			"convertChars": true,
			"size": 0,
			"ivarOffset": 0}

this.atCachePrototype = [this.atCacheEntryPrototype]
this.atCache = []
this.atCache.prototype = this.atCachePrototype

When generating WASM source, an assignment parse node can check to see if its name ends with “Prototype”, and create type information instead of generating source. The actual JS code for setting a prototype does practically nothing at VM runtime, so has no impact on performance. Types are cached by the left-side node of an assignment expression, and by the outermost scope in a registry of all types. The types themselves are instances of a WASMType hierarchy. They can print WASM for themselves, and assist in printing the WASM for structs that use them.

Overall, I prefer to keep the SqueakJS implementation in JS rather than TypeScript, to keep the fully dynamic style. These prototype annotations are small and manageable.

further JIT optimization

After I’ve got WASM source for the complete virtual machine, I plan to turn my attention to the SqueakJS JIT. This translates Smalltalk compiled method instructions to JS code, which in turn is compiled to physical processor instructions by the JS execution engine. It may be that the web browser can generate more efficient native code from WASM we’ve generated from the generated JS code. It will be good to measure it.

1 Comment »

a WebAssembly Squeak virtual machine is running

I’ve replaced the inner instruction-dispatch loop of a running SqueakJS virtual machine with a handwritten WebAssembly (WASM) function, and run several thousand instructions of the Caffeine object memory. The WASM module doesn’t yet have its own memory. It’s using the same JavaScript objects that the old dispatch loop did, and the supporting JS state and functions (like Squeak.Interpreter class). I wrote a simple object proxy scheme, whereby WASM can use unique integer identifiers to refer to the Smalltalk objects.

Because of this indirection, the current performance is very slow. The creation of an object proxy is based on stable object pointer (OOP) values; young objects require full garbage collection to stabilize their OOPs. There is also significant overhead in calling JavaScript functions from WASM. At this stage, the performance penalties are worthwhile. We can verify that the hybrid JS/WASM interpreter is working, without having to write a full WASM implementation first.

a hybrid approach

My original approach was to recapitulate the Slang experience, by using Epigram to decompile the Smalltalk methods of a virtual machine to WASM. I realized, though, that it’s better to take advantage of the livecoding capacity of the SqueakJS VM. I can replace individual functions of the SqueakJS VM, maintaining a running system all the while. I can also switch those functions back and forth while the system is running, perhaps many millions of instructions into a Caffeine session. This will be invaluable for debugging.

The next goal is to duplicate the object memory in a WASM memory, and operate on it directly, rather than using the object proxy system. I’ll start by implementing the garbage collector, and testing that it produces correct results with an actual object memory, by comparing its behavior to that of the SqueakJS functions.

Minimal object memories will be useful in this process, because garbage collection is faster, and there is less work to do when resuming a snapshot.

performance improvement expected

From my experiment with decompiling a Smalltalk method for the Fibonacci algorithm into WASM, I saw that WASM improves the performance of send-heavy Smalltalk code by about 250 times. I was able to achieve significant speedups from the targeted use of WASM for the inner functions of BitBLT. From surveying performance comparisons between JS and WASM, I’m expecting a significant improvement for the interpreter, too.

5 Comments »

tactical Squeak speedups with WebAssembly

With the JavaScript bridge in SqueakJS, we can utilize built-in web browser behavior and other JS frameworks from Smalltalk, just as any other JS code would. I’ve used it to build Caffeine apps using A-Frame and croquet.io. Another useful framework we can integrate is WebAssembly (WASM), a stack-oriented instruction set for writing high-performance code. I have begun to identify performance-critical code in the SqueakJS virtual machine, and replace it with WASM code. The initial results are encouraging and useful.

identifying hotspots

I’m running SqueakJS in the Chrome web browser. To identify virtual machine code that consumes the most time, I profile use cases that seem slow, using Chrome’s built-in devtools. The first use case I chose was drag-selecting a large quantity of text in a workspace.

*a performance capture of drag-selecting text, indicating that rgbMulwith() is the most time-consuming inner BitBLT function*

From reviewing a performance capture of this use case, we can see that rgbMulwith() is the most time-consuming inner function from the BitBLT plugin. While it doesn’t modify variables in outer scopes, it does read a plugin-global variable. Most of the work it does, however, is done by partitionedMulwithnBitsnPartitions(), a pure function returning the result of a mathematical operation on the inputs, without any other system state interaction. That makes it well-suited to WASM implementation.

While there are APIs for coordinating side effects with JavaScript, they are relatively slow. It is therefore harder to rationalize WASM implementations of individual higher-level Squeak virtual machine primitives, since they interact extensively with complex JavaScript objects like the Squeak interpreter. Eventually, we’ll represent the entire Squeak object memory inside a WASM memory, and implement the entire Squeak virtual machine with WASM functions. WASM garbage collection will assist the Squeak garbage collector, much as the JavaScript garbage collector assists the SqueakJS VM now. JavaScript interaction will be limited to the WASM implementation of the SqueakJS JS bridge.

translating from JS to WASM

Here’s the existing JS implementation of partitionedMulwithnBitsnPartitions():

With its stack-based instructions, WASM code is reminiscent of Smalltalk bytecode. Here’s some of the equivalent WASM implementation of the above function, written by hand. The WASM memory holds the maskTable from BitBitPlugin.js.

Note that WASM’s shift-left and shift-right instructions are fine as is; we don’t need to make wrapper functions for them as we did in JS.

After I modified the BitBLT plugin so that rgbMulwith() uses partitionedMUL(), drag-selecting text in the Caffeine user interface was much more responsive, and a different inner BitBLT plugin function was the most time-consuming. Even though rgbMulwith() used a small percentage of total time in the first performance capture, every saved millisecond significantly improves perceived animation smoothness. By using additional use cases (scrolling long lists, and repainting by alternating the stacking order of two windows), I identified other inner BitBLT plugin functions to optimize. The Caffeine user interface is now much more responsive than it was. This is especially useful with Worldly, the spatial IDE I’m building with Caffeine and A-Frame, where every bit of performance matters.

an alternative to writing WASM by hand

For the JS code in the Squeak virtual machine, it makes sense to write replacement WASM code by hand. Since WASM code is so similar to Smalltalk bytecode, for Smalltalk compiled methods it makes more sense to use automated decompilation to WASM. I have done this for a small proof-of-concept, using a Smalltalk compiled method for the Fibonacci algorithm.

Using the Smalltalk compiler and decompiler I wrote with my Epigram parsing framework, I was able to decompile the Smalltalk compiled method for the Fibonacci algorithm into WASM text. I then used an in-browser version of the WebAssembly Binary Toolkit from Caffeine to generate binary WASM, compile it in the current page as a function, and call the function. Comparing the execution time of finding the 29th Fibonacci number in both Smalltalk and WASM showed that WASM had 250 times the execution speed of the normal SqueakJS bytecode-to-JS translator.

I plan to write, in Smalltalk, a version of the Squeak virtual machine simulator that stores all objects in a WASM memory. Once it can evaluate (3 + 4), I’ll translate all its Smalltalk compiled methods to WASM, and see how much faster it runs. The next step will be to get a JS bridge working, and implement interfaces to the web browser DOM for graphics and user input event handling. Ultimately, a WASM implementation of the Squeak virtual machine may be preferable to the SqueakJS virtual machine.

3 Comments »

thisContext

Archive for the Caffeine Category

livecoding new AI skills with conversational MCP

anatomy of an MCP tool

routing MCP requests from the model to the environment

conversational tool reflection

What would you do with it?

AI-assisted just-in-time compilation in Catalyst

dynamic method translation by LLM

the future

Catalyst update: a WASM GC Smalltalk virtual machine and object memory are running

the current test: evaluating (3 squared)

the interpreter

the next test: just-in-time compilation of methods to WASM

future work: reading snapshots and more

a Tethered server-side display

a networked graphics pipeline

on-demand display enables livecoded server development

a voice-controlled Ableton Live

Context-Aware AI Conversations in Smalltalk

dynamic translation of Smalltalk to WebAssembly

compiled method transcription

the “hello world” of Smalltalk: 3 + 4

WASM types for virtual machine structures

automated translation of JavaScript to WebAssembly for SqueakJS

type hinting

further JIT optimization

a WebAssembly Squeak virtual machine is running

a hybrid approach

performance improvement expected

tactical Squeak speedups with WebAssembly

identifying hotspots

translating from JS to WASM

an alternative to writing WASM by hand

search

tools

highlights

Archive for the Caffeine Category

anatomy of an MCP tool

routing MCP requests from the model to the environment

conversational tool reflection

What would you do with it?

Share this:

dynamic method translation by LLM

the future

Share this:

the current test: evaluating (3 squared)

the interpreter

the next test: just-in-time compilation of methods to WASM

future work: reading snapshots and more

Share this:

a networked graphics pipeline

on-demand display enables livecoded server development

Share this:

Share this:

Share this:

compiled method transcription

the “hello world” of Smalltalk: 3 + 4

WASM types for virtual machine structures

Share this:

type hinting

further JIT optimization

Share this:

a hybrid approach

performance improvement expected

Share this:

identifying hotspots

translating from JS to WASM

an alternative to writing WASM by hand

Share this:

search

tools

highlights