Archive for the Spoon Category

Catalyst phase four: decompiling the virtual machine simulator to WebAssembly as it runs

Posted in Appsterdam, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS on 10 September 2025 by Craig Latta
A person operating a marionette, who is himself a marionette, operated by a hand of unknown provenance. Copyright (c) 2025 Craig Latta.
Through simulation, we can specify the behavior of the virtual machine exactly.

I’m writing phase four of the Catalyst Smalltalk virtual machine, producing a WASM GC version of the virtual machine from a Smalltalk implementation. WASM GC is statically typed. While I prefer the dynamically-typed livecoding style of Smalltalk, I want the WASM GC representation to be thoroughly idiomatic. This presents an opportunity to revisit the process of generating code from the virtual machine simulation; we can use the simulation to be precise about types, while livecoding Smalltalk as before.

imprinting returns

As part of earlier work I did to produce a minimal object memory, I developed a facility for imprinting behavior (methods) onto one system from another, as a side-effect of running them. We can use a similar technique to imprint the methods of a virtual machine simulation onto WASM GC code files. This affords some interesting possibilities. If we construct a minimal object memory as part of the simulation, we can ensure the virtual machine is also minimal, containing only the code necessary for running that object memory. That might have useful security properties.

We also have the traditional correctness proof that a virtual machine generated from its simulation gives us: the generated virtual machine can run that object memory, since the simulation could run it. But by imprinting the generated virtual machine from a running simulation, rather than statically, we also have a stronger proof of type correctness. We don’t need to infer types, since we have live objects from the simulation at hand. This lets Catalyst take full advantage of the optimizations in the WASM GC development environment (e.g., ‘wasm-opt’) and runtime environment (e.g., V8). It’s also easier to create fast virtual machines for particular debugging situations; you can change a running simulation that has come upon a problem, and generate the corresponding WASM virtual machine module quickly.

the details

This approach is enabled by more of my previous work. The interpretation of compiled methods is done by reified virtual machine instructions (“bytecodes”). Those instructions are initialized using the Epigram parsing and compilation framework I wrote. Epigram uses reified BNF grammar production rules to parse source code, and to form an abstract syntax tree. Each instruction has a code generator object corresponding to the language element it manipulates (e.g., an instance variable). Each code generator has a copy of the production rule that justifies the existence of the language element.

For example, in the Smalltalk grammar, a pseudo-variable reference (“self”, “super”, or “thisContext”) is parsed by the PseudoVariableReference production rule, a shared variable in a pool dictionary of Smalltalk production rules. A ReturnReceiver instruction has a SmalltalkReceiver code generator, which in turn has a copy of the PseudoVariableReference rule. Production rule copies are used to do the actual work of parsing a particular source; they hold parsed data (e.g., the characters for “self”), also known as terminals. The original production rule objects exist solely to define a grammar.

After being initialized from the rules used to parse the source code of a Smalltalk method, instruction objects can assist a virtual machine and its active context with interpretation. They can also generate source code, either in Smalltalk or some other language, with the assistance of their code generators. For WASM GC, code generation happens during interpretation, so that the types of all variables and return values are known concretely, and generation can be contingent on successful type validation.

type annotation and representation

For this approach to work, we need a source for that type information. I use pragmas contained in the methods of the simulation. Each Smalltalk method that will be translated to a WASM GC function has a pragma for each parameter and temporary variable, and another for the return value (if the WASM GC function is to have one). Every Smalltalk method leaves an object of some sort on the stack, even if it’s just the receiver, but a WASM GC function may not.

A typical temporary variable pragma looks like this:

<local: #index type: #i32 attributes: #(mutable nullable)>

This corresponds to the semantics of local variables in a WASM GC function. Each one is declared with the “local” WASM GC instruction, and has a name, a type, and two attributes. A variable is mutable if its value can be changed after initialization, and nullable if its value can be null (a null of an appropriate type, of course). A variable can either have or not have each of those attributes.

WASM GC has a few built-in types, like “i32” for 32-bit integers. For reference (pointer) types, we write “(ref $type)”, where “type” is defined elsewhere in the source file. We define reference types for each kind of structure (“struct”) we’ll be using in the simulation, also by using method pragmas. Each class in the simulation has a “type” method which defines those pragmas. Just as there’s a format for variable type annotation pragmas, there’s another for struct type annotation, using “field:” instead of “local:”.
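
As a sketch, a “type” method for the vm struct might look like this. The field: pragma format follows the local: format shown above, and the fields here match the vm struct that appears later in this archive; the signature-building expression and class name are assumptions:

VirtualMachine class>>type
	"Answer the type signature for instances, built from these field pragmas.
	(A sketch; StructTypeSignature and its constructor are assumed names.)"

	<field: #sp type: #i32 attributes: #(mutable)>
	<field: #pc type: #i32 attributes: #(mutable)>

	^StructTypeSignature fromPragmasOf: thisContext method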

When we send “type” to a struct class, we get a type signature, a specialized dictionary created from those pragmas. A signature object is similar to a JavaScript object, in that we can send a message named for a desired field’s key, and the signature will answer the value it has for that key. Signatures are ordered dictionaries, since the order of variables in a function or fields in a struct type is crucial. Unlike JavaScript objects or Smalltalk dictionaries, signatures can have multiple values for the same key. This enables us to evaluate an expression like (VirtualMachine type field) to get an OrderedCollection of the fields in VirtualMachine’s type signature.

Signature objects are implemented by classes descended from PseudoJSObject, a Smalltalk implementation of JavaScript object semantics that inherits from OrderedDictionary. There are further specializations for different kinds of signatures, for functions, variables, and structs.

virtual machine generation in action

The preliminaries of the WASM GC source file for the virtual machine are written by the VirtualMachine itself. This includes all the reference type definitions, and definitions for functions and variables that Catalyst will import from and export to a webpage that’s running it. It also includes translations of the methods used to create the object memory. After interpretation begins, every time the VirtualMachine is about to interpret an instruction, it translates the methods run to interpret the previous instruction.

The VirtualMachine knows which methods have been run by using the method-marking facility I wrote about in “The Big Shake-Out”. When the system runs a compiled method, it sets a field in the method to an increasing counter value. In the object memory, one can ask the method for the value of that field, and one can set the field to zero. We can zero that field in every method at some initial time, run the system, and get a collection of every method that was run in the meantime, ordered by how recently they were run.
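
In outline, collecting the recently-run methods looks something like this (the runMark accessors are assumed names for the marking field; the field itself is as described above):

| methodsRun |
CompiledMethod allInstancesDo: [:method | method runMark: 0].
"...run the system for a while..."
methodsRun := (CompiledMethod allInstances
	reject: [:method | method runMark = 0])
	asSortedCollection: [:a :b | a runMark < b runMark]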

exemplar object memories

The object memory I’m running in the simulation is the same one I created in the initial handwritten version of the WASM GC virtual machine. It runs a simple polynomial benchmark method that could be optimized into a much more efficient WASM GC function by AI, and also demonstrates method caching and just-in-time compilation. We could develop formalisms for composing this object memory. Through the rigorous use of unit tests, we could produce object memories that exercise every instruction and primitive exhaustively, and act as minimal harnesses for benchmarks.

next: phase five, object memory snapshots

After the WASM GC virtual machine is free of handwritten WASM GC code, I’ll implement the ability to write and resume object memory snapshots. See you then!

AI-assisted just-in-time compilation in Catalyst

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS on 22 July 2025 by Craig Latta
AI can help us find the “desire paths” in stack instructions.

There’s a long tradition of just-in-time compilation of code in livecoding systems, from McCarthy’s LISP systems in the 1960s, to the dynamic method translation of the Deutsch-Schiffman Smalltalk virtual machine, to the “hotspot” compilers of Self, Strongtalk and Java, to current implementations for JavaScript and WebAssembly in Chrome’s V8 and Firefox’s SpiderMonkey. Rather than interpret sequences of virtual machine instructions, these systems translate instruction sequences into equivalent (and ideally more efficient) actions performed by the instructions of a physical processor, and run those instead.

We’d like to employ this technique with the WASM GC Catalyst Smalltalk virtual machine as well. Translating the instructions of a Smalltalk compiled method into WASM GC instructions is straightforward, and there are many optimizations for those instructions that we can specify ahead of time. But with the current inferencing abilities of large language models (LLMs), we can leave even that logic until runtime.

dynamic method translation by LLM

Since Catalyst runs as a WASM module orchestrated by SqueakJS in a web browser, and the web browser has JavaScript APIs for WASM interoperation, and there are JS APIs for interacting with LLMs, we can incorporate LLM inference in our translation of methods to WASM functions. We just need an expressive system for composing appropriate prompts. Using the same Epigram compilation framework that enables the decompilation of the Catalyst virtual machine itself into WASM GC, we can express method instructions in a prompt, by delegating that task to the reified instructions themselves.
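
A sketch of such prompt composition, with assumed class and selector names (the real protocol lives in the reified instructions):

LLMMethodTranslator>>promptFor: aCompiledMethod
	"Answer an LLM prompt describing aCompiledMethod's instructions.
	(A sketch; instructions and printPromptOn: are assumed selectors.)"

	^String streamContents: [:stream |
		stream
			nextPutAll: 'Translate these Smalltalk stack instructions into an equivalent, optimized WASM GC function:';
			cr.
		aCompiledMethod instructions
			do: [:each | each printPromptOn: stream]]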

For an example, let’s take the first method developed for Catalyst to execute, SmallInteger>>benchmark, a simple but sufficiently expensive benchmark. It repeats this pattern five times: add one to the receiver, multiply the result by two, add two to the result, multiply the result by three, add three to the result, and multiply the result by two. This is trivial to express as a sequence of stack operations, in both Smalltalk instructions and WASM instructions.
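
Reconstructed from that description (and consistent with the instruction count quoted below), the method is a straight-line expression; recall that Smalltalk binary messages evaluate strictly left to right:

SmallInteger>>benchmark
	"Answer the result of a simple, sufficiently expensive computation.
	(A plausible reconstruction: five inlined repetitions of the pattern
	described above.)"

	^self
		+ 1 * 2 + 2 * 3 + 3 * 2
		+ 1 * 2 + 2 * 3 + 3 * 2
		+ 1 * 2 + 2 * 3 + 3 * 2
		+ 1 * 2 + 2 * 3 + 3 * 2
		+ 1 * 2 + 2 * 3 + 3 * 2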

Our pre-written virtual machine code can do the simple translation between those instruction sets without using an LLM at all. With a little reasoning, an LLM can recognize from those instructions that something is being performed five times, and write a loop instead of inlining all the operations. With a little more reasoning, it can do a single-cycle analysis and discover the algebraic relationship between the receiver and the output (248,832n + 678,630). That enables it to write a much faster WASM function of five instructions instead of 62.
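
The single-cycle analysis checks out: one cycle of (+ 1, × 2, + 2, × 3, + 3, × 2) maps n to 12n + 30, so five cycles map n to 12⁵n + 30(12⁴ + 12³ + 12² + 12 + 1) = 248,832n + 678,630. (A push of the receiver, thirty constant-push/operation pairs, and a return also account for the 62 instructions of the naive translation.)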

the future

This is a contrived example, of course, but it clearly shows the potential of LLM-assisted method translation, at least for mathematical operations. I’ve confirmed that it works in Catalyst, and used the results to populate a polymorphic inline cache of code to run instead of interpretation. Drawing inspiration from the Self implementation experience, what remains to be seen is how much time and money is appropriate to spend on the LLM. This can only become clear through real use cases, adapting to changing system conditions over time.

Catalyst update: a WASM GC Smalltalk virtual machine and object memory are running

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS on 29 June 2025 by Craig Latta
The core of the mechanism is working.

I’ve bootstrapped a Smalltalk virtual machine and object memory as a WebAssembly (WASM) module, using the type system that supports garbage collection there. I have two motivations for doing this: I’d like to see how fast it can run, and I’d like to see how it can interoperate with other WASM modules in diverse settings, including web browsers, servers, and native mainstream OS apps.

the current test: evaluating (3 squared)

The very first thing I ran was a method adding three and four, reporting the result through a webpage. This required types and classes for Object, SmallInteger, ByteArray, Dictionary, Class, CompiledMethod, Context, and Process; a selector for #+; and functions for interpreting bytecodes, for creating arrays, dictionaries, contexts, and the initial object memory, for manipulating dictionaries, stacks, and contexts, and for reporting results to JavaScript.

Evaluating (3 + 4) only uses an addition bytecode, instead of actually sending a message. After I got a successful result, I changed the expression to (3 squared). This tested sending an actual message: creating a context for the computation, and invoking a method different from the one sending the message.

Using WASM’s JavaScript interoperation facilities, I export two WASM functions to JS for execution in a web browser: createMinimalBootstrap() and interpret(). The createMinimalBootstrap function creates classes, selectors, an unbound method that sends "squared" with an initial context for it, and a "squared" method installed in class SmallInteger; it also initializes interpreter state.
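
The "squared" method installed in SmallInteger is just the familiar one-liner from stock Squeak (where it lives in Number):

SmallInteger>>squared
	"Answer the receiver multiplied by itself."

	^self * self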

With the interpreter and object memory set up, JS can tell the WASM module to start interpreting bytecodes, with interpret(). The initial unbound method, after the bytecodes for performing (3 squared), has a special bytecode for reporting the result to JS. It calls a JS function imported from the webpage, which simply prints the result in the webpage. The webpage also reports how long the interpreter takes to run, which might be useful when measuring the speed of methods compiled to WASM functions.

the interpreter

the SqueakWASM virtual machine returning a result to JavaScript

The interpreter implements all the traditional Smalltalk bytecodes, and a few more for interoperating with JavaScript. Smalltalk objects are represented with WASM GC reference types: i31refs for SmallIntegers, and structrefs for all other objects. There is a type hierarchy mirroring the Smalltalk classes used in the system. With all objects implemented as reference types, rather than as byte sequences in linear WASM memory, we can leverage WASM’s garbage collector. This is similar to the way SqueakJS leverages the JS runtime engine’s garbage collector in a web browser. Also like SqueakJS, integers too large for a signed 31-bit i31ref are boxed as LargeIntegers, since WASM GC doesn’t yet have a built-in reference type for 63-bit signed integers.

the next test: just-in-time compilation of methods to WASM

Now that I can run SmallInteger>>squared as interpreted bytecodes, I’ll write a rudimentary translator from bytecode sequences to WASM functions. It may provide an interesting micro-benchmark for comparing execution speeds.

future work: reading snapshots and more

Obviously, a Smalltalk virtual machine does many things; this is a tiny but promising beginning. In the near future, I’d like to support reading existing Squeak, Pharo, and Cuis object memories, provide more extensive integration with device capabilities through JavaScript, web browsers, and WASI, and support the Sista instruction set for compatibility with the OpenSmalltalk Cog virtual machine. I’m especially interested to see how SqueakWASM might integrate with other WASM modules in the wild.

What would you do with WASM Smalltalk? Please let me know!

dynamic translation of Smalltalk to WebAssembly

Posted in Caffeine, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS on 26 July 2023 by Craig Latta
continuing with the DNA theme…

In Catalyst, a WebAssembly implementation of the OpenSmalltalk virtual machine, there are three linguistic levels in play: Smalltalk, JavaScript (JS), and WebAssembly (WASM). Smalltalk is our primary language, JS is the coordinating language of the hosting environment (a web browser), and WASM is a high-performance runtime instruction set to which we can compile any other language. In a previous article, I wrote about automatic translation of JS to WASM, as a temporary way of translating the SqueakJS virtual machine to WASM. That benefits from a proven JS starting point for the relatively large codebase of the virtual machine. When translating individual Smalltalk compiled methods for “just-in-time” optimization, however, it makes more sense to translate from Smalltalk to WASM directly.

compiled method transcription

We already have infrastructure for transcribing Smalltalk compiled methods, via class InstructionStream. We use it to print human-readable descriptions of method instructions, and to simulate their execution in the Smalltalk debugger. We can also use it to translate a method to human-readable WebAssembly Text (WAT) source code, suitable for translation to binary WASM code which the web browser can execute. Since the Smalltalk and WASM instruction sets are both stack-oriented, the task is straightforward.

I’ve created a subclass of InstructionStream, called WATCompiledMethodTranslator, which uses the classic scanner pattern to drive translation from Smalltalk instructions to WASM instructions. With accompanying WASM type information for Smalltalk virtual machine structures, we can make WASM modules that execute the instructions for individual Smalltalk methods.
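
A minimal sketch of that arrangement (the translate: entry point is an assumed name; interpretNextInstructionFor: is the standard InstructionStream scanning protocol):

WATCompiledMethodTranslator>>translate: aCompiledMethod
	"Scan aCompiledMethod, transcribing each Smalltalk instruction as WASM
	instructions on the output stream."

	| scanner |
	scanner := InstructionStream on: aCompiledMethod.
	[scanner pc <= aCompiledMethod endPC]
		whileTrue: [scanner interpretNextInstructionFor: self]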

the “hello world” of Smalltalk: 3 + 4

As an example, let’s take a look at translating the traditional first Smalltalk expression, 3 + 4. We’ll create a Smalltalk method in class HelloWASM from this source:

HelloWASM>>add
	"Add two numbers."

	^3 + 4

This gives us a compiled method with the following Smalltalk instructions. On each line below, we list the program counter value, the instruction, and a description of the instruction.

0: 0x20: push the literal constant at index 0 (3) onto the method's stack
1: 0x21: push the literal constant at index 1 (4) onto the method's stack
2: 0xB0: send the arithmetic message at index 0 (+)
3: 0x7C: return the top of the method's stack

A WATCompiledMethodTranslator uses an instance of InstructionStream as a scanner of the method, interpreting each Smalltalk instruction in turn. When interpreting an instruction, the scanner sends a corresponding message to the translator, which in turn writes a transcription of that instruction as WASM instructions, onto a stream of WAT source.

The first instruction in the method is “push the literal constant at index 0”. The scanner finds the indicated literal in the literal frame of the method (i.e., 3), and sends pushConstant: 3 to the translator. Here are the methods that the translator runs in response:

WATCompiledMethodTranslator>>pushConstant: value
	"Push value, a constant, onto the method's stack."

	self
		comment: 'push constant ', value printString;
		pushFrom: [value printWATFor: self]
WATCompiledMethodTranslator>>pushFrom: closure
     "Evaluate closure, which emits WASM instructions that push a value onto the WASM stack. Emit further WASM instructions that push that value onto the Smalltalk stack."

	self
		setElementAtIndexFrom: [
			self
				incrementField: #sp
				ofStructType: #vm
				named: #vm;
				getField: #sp
				ofStructType: #vm
				named: #vm]
		ofArrayType: #pointers
		named: #stack
		from: closure
WATCompiledMethodTranslator>>setElementAtIndexFrom: elementIndexClosure ofArrayType: arrayTypeName named: arrayName from: elementValueClosure
	"Evaluate elementIndexClosure to emit WASM instructions that leave an array index on the WASM stack. Evaluate elementValueClosure to emit WASM instructions that leave an array element value on the WASM stack. Emit further WASM instructions, setting the element with that index in an array of the given type and variable name to the value."

	self get: arrayName.
	{elementIndexClosure. elementValueClosure} do: [:each | each value].

	self
		indent;
		nextPutAll: 'array.set $';
		nextPutAll: arrayTypeName

In the final method above, we finally see a WASM instruction, array.set. The translator implements stream protocol for actually writing WAT text to a stream. The comment:, get:, and getField:ofStructType:named: methods are similar, using “;;” and the array.get and struct.get WASM instructions. The array and struct instructions are part of the WASM garbage collection extension, which introduces types.
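
For instance, comment: might be implemented like this (a sketch, consistent with the stream protocol used in the methods above):

WATCompiledMethodTranslator>>comment: aString
	"Emit aString as a WAT line comment."

	self
		indent;
		nextPutAll: ';; ';
		nextPutAll: aString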

WASM types for virtual machine structures

To actually use WASM instructions that make use of types, we need to define the types in our method’s WASM module. In pushFrom: above, we use a struct variable of type vm named vm, and an array variable of type pointers named stack. The vm variable holds global virtual machine state (for example, the currently executing method’s stack pointer), similar to the SqueakJS.vm variable in SqueakJS. The stack variable holds an array of Smalltalk object pointers, constituting the current method’s stack. In general, the WASM code for a Smalltalk method will also need fast variable access to the active Smalltalk context, the active context’s stack, the current method’s literals, and the current method’s temporary variables.

Our WASM module for HelloWASM>>add might begin like this:

(module
	(type $bytes (array (mut i8)))
	(type $words (array (mut i32)))
	(type $pointers (array (ref $object)))

	(type $object (struct
		(field $metabits (mut i32))
		(field $class (ref $object))
		(field $format (mut i32))
		(field $hash (mut i32))
		(field $pointers (ref $pointers))
		(field $words (ref $words))
		(field $bytes (ref $bytes))
		(field $float (mut f32))
		(field $integer (mut i32))
		(field $address (mut i32))
		(field $nextObject (ref $object))))

	(global $vm (struct
		(field $sp (mut i32))
		(field $pc (mut i32))))

	(global $stack (array (ref $pointers)))

	(func $HelloWASM_add
		;; pc 0
		;; push constant 3
		global.get $stack
		global.get $vm
		global.get $vm
		struct.get $vm $sp
		i32.const 1
		i32.add
		struct.set $vm $sp ;; increment the stack pointer
		global.get $vm
		struct.get $vm $sp
		i32.const 3
		array.set $pointers
		
		;; pc 1
		...

As is typical with assembly-level code, there’s a lot of setup involved which seems quite verbose, but it enables fast paths for the execution machinery. We’re also effectively taking on the task of writing the firmware for our idealized Smalltalk processor, by setting up interfaces to contexts and methods, and by implementing the logic for each Smalltalk instruction. In a future article, I’ll discuss the mechanisms by which we actually run the WASM code for a Smalltalk method. I’ll also compare the performance of dynamic WASM translations of Smalltalk methods versus the dynamic JS translations that SqueakJS makes. I don’t expect the WASM translations to be much (or any) faster at the moment, but I do expect them to get faster over time, as the WASM engines in web browsers improve (just as JS engines have).

automated translation of JavaScript to WebAssembly for SqueakJS

Posted in Caffeine, Naiad, Smalltalk, Spoon, SqueakJS on 6 July 2023 by Craig Latta
creating WASM from JS is a bit like creating DNA from proteins

After creating a working proof-of-concept Squeak Smalltalk virtual machine with a combination of existing SqueakJS code and handwritten WASM (for the instruction functions), I set about automating the generation of WASM from JS for the rest of the functions. (A hybrid WASM/JS virtual machine has poor performance, because of the overhead of calling JS functions from WASM.) Although I expect eventually to write a JS parser with Epigram, for now I’m using the existing JS parser Esprima, via the JS bridge in SqueakJS. (The benefits of using Epigram here will be greatly improved debugging, portability to non-JS platforms, and retention of parsed comments.) After parsing the SqueakJS VM code into Smalltalk objects representing JS parse nodes, I’m using those objects’ WASM generation behavior to generate a WASM-only virtual machine. I’m taking advantage of the type-management instructions newly added to WASM as part of its garbage-collection proposal.

type hinting

To make effective use of those instructions, we need the JS code to give some hints about object structure. For example, the SqueakJS at-cache uses JS objects whose structure is emergent, rather than defined explicitly in advance. If SqueakJS were written in TypeScript, where all structures are defined in advance, we would have this information already. Instead, I add a prototype JS object to the at-cache object, describing the type of an at-cache entry:

this.atCacheEntryPrototype = {
	"array": [],
	"convertChars": true,
	"size": 0,
	"ivarOffset": 0}

this.atCachePrototype = [this.atCacheEntryPrototype]
this.atCache = []
this.atCache.prototype = this.atCachePrototype

When generating WASM source, an assignment parse node can check to see if its name ends with “Prototype”, and create type information instead of generating source. The actual JS code for setting a prototype does practically nothing at VM runtime, so has no impact on performance. Types are cached by the left-side node of an assignment expression, and by the outermost scope in a registry of all types. The types themselves are instances of a WASMType hierarchy. They can print WASM for themselves, and assist in printing the WASM for structs that use them.
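
In sketch form, with assumed node protocol names:

WASMAssignmentNode>>generateOn: aGenerator
	"If this assignment defines a type prototype, record type information;
	otherwise, generate WASM source as usual. (A sketch with assumed names.)"

	(self targetName endsWith: 'Prototype')
		ifTrue: [aGenerator noteTypeNamed: self targetName from: self valueNode]
		ifFalse: [self generateSourceOn: aGenerator]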

Overall, I prefer to keep the SqueakJS implementation in JS rather than TypeScript, preserving the fully dynamic style. These prototype annotations are small and manageable.

further JIT optimization

After I’ve got WASM source for the complete virtual machine, I plan to turn my attention to the SqueakJS JIT. This translates Smalltalk compiled method instructions to JS code, which in turn is compiled to physical processor instructions by the JS execution engine. The web browser may be able to generate more efficient native code from WASM derived from that generated JS than from the JS itself. It will be good to measure it.

a WebAssembly Squeak virtual machine is running

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS on 14 April 2023 by Craig Latta
the instructions are ticking!

I’ve replaced the inner instruction-dispatch loop of a running SqueakJS virtual machine with a handwritten WebAssembly (WASM) function, and run several thousand instructions of the Caffeine object memory. The WASM module doesn’t yet have its own memory. It’s using the same JavaScript objects that the old dispatch loop did, and the supporting JS state and functions (like Squeak.Interpreter class). I wrote a simple object proxy scheme, whereby WASM can use unique integer identifiers to refer to the Smalltalk objects.

Because of this indirection, the current performance is very slow. The creation of an object proxy is based on stable object pointer (OOP) values; young objects require full garbage collection to stabilize their OOPs. There is also significant overhead in calling JavaScript functions from WASM. At this stage, the performance penalties are worthwhile. We can verify that the hybrid JS/WASM interpreter is working, without having to write a full WASM implementation first.

a hybrid approach

My original approach was to recapitulate the Slang experience, by using Epigram to decompile the Smalltalk methods of a virtual machine to WASM. I realized, though, that it’s better to take advantage of the livecoding capacity of the SqueakJS VM. I can replace individual functions of the SqueakJS VM, maintaining a running system all the while. I can also switch those functions back and forth while the system is running, perhaps many millions of instructions into a Caffeine session. This will be invaluable for debugging.

The next goal is to duplicate the object memory in a WASM memory, and operate on it directly, rather than using the object proxy system. I’ll start by implementing the garbage collector, and testing that it produces correct results with an actual object memory, by comparing its behavior to that of the SqueakJS functions.

Minimal object memories will be useful in this process, because garbage collection is faster, and there is less work to do when resuming a snapshot.

performance improvement expected

From my experiment with decompiling a Smalltalk method for the Fibonacci algorithm into WASM, I saw that WASM improves the performance of send-heavy Smalltalk code by about 250 times. I was able to achieve significant speedups from the targeted use of WASM for the inner functions of BitBLT. From surveying performance comparisons between JS and WASM, I’m expecting a significant improvement for the interpreter, too.

Caffeine Web services through Deno

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS on 9 July 2022 by Craig Latta
Caffeine in a Deno worker can provide Web APIs to Smalltalk in a native app.

bridging native apps and the Web

We’ve been able to run Caffeine headlessly in a Web Worker for some time now, using NodeJS. I’ve updated this support to use the Deno JavaScript runtime instead of Node. This gives us better access to familiar Web APIs, and a cleaner module system without npm. I’ve also extended the bridging capability of the code that Deno runs. Now, a native Squeak app can start Deno (via class OSProcess), Deno starts Caffeine in a worker, and the two Smalltalk instances can communicate with each other via remote messaging.

I’m using this bridge to let native Squeak participate in WebRTC sessions with other Smalltalks, as part of the Naiad team development system. The same Squeak object memory runs in both the native Squeak and the Deno worker. I’m sure many other interesting use cases will arise, as we explore what native Squeak and Web Squeak can do together!

Naiad progress 2019-12-02: online team services

Posted in Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS on 2 December 2019 by Craig Latta
Naiad keeps livecoders informed of their teammates’ activity, and remembers all history.

topology established

Naiad is Caffeine‘s live module system. The goal is to support live versioning of classes and methods as they are edited, from connected teams of developers using Smalltalk or JavaScript IDEs from web browsers and native apps. Naiad keeps each developer informed of events meaningful to their teams and work. It’s comparable to a mashup of GitHub and Slack, and will interoperate with them as well.

The current Naiad prototype uses a relay network of NodeJS servers, each with Caffeine running in a Web Worker thread, and each serving a set of Caffeine-based client IDEs, in web browsers and native apps. The workers keep track of class and method versions, system checkpoints, and teams, using the relays to broadcast events to clients. Clients can request various services of the workers, like joining teams and making checkpoints from object memory snapshots.

These two clients are connected to the same relay server. The client on the left created a new team, by sending a message to the relay’s worker. The worker created the team, and told the relay to notify all of its peers (clients and relays). For now, clients respond by inspecting the new team.

I’ve just made the first system checkpoint, and broadcast the first team event (the creation of a team). Eventually, Naiad will support events for several services, including team chatting and screen-sharing, history management, and application deployment. I’m still eager to hear what events and services you would want in a livecoding notification system; please let me know! I expect the first public release of this work to be part of the second 2019 solstice release, on 22 December.

a Web UIs update

Posted in Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS on 12 September 2019 by Craig Latta
livecoded Vue versions of the Smalltalk devtools

I’ve created working Vue versions of the traditional Smalltalk workspace and classes browser, livecoded in the web browser from the full Squeak IDE. These use the vue-draggable-resizable component as the basis of a window system, for dragging and resizing, and the vue-menu component for pop-up context menus. Third-party Vue components are loaded live from the network using http-vue-loader, avoiding all offline build steps (e.g., with webpack). Each Smalltalk devtool UI is expressed as a Vue “single-file component” and loaded live.

When enough of the Smalltalk devtools are available in this format, I can provide an initial Squeak object memory snapshot without the UI process and its supporting code, and without the relatively large bitmaps for the Display, drop-shadows, and fonts. This snapshot will be about two megabytes, down from the 35-megabyte original. (I also unloaded lots of other code in The Big Shake-Out, including Etoys and Monticello). This will greatly improve Caffeine’s initial-page-load and snapshot times.

I’m also eager to develop other apps, like a proper GUI for the Chrome devtools, a better web browser tabs manager, and several end-user apps. Caffeine is becoming an interesting platform!

The Big Shake-Out

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS on 25 March 2019 by Craig Latta

Golden Retriever shaking off water

Some of those methods were there for a very long time!

I have adapted the minimization technique from the Naiad module system to Caffeine, my integration of OpenSmalltalk with the Web and Node platforms. Now, from a client Squeak, Pharo, or Cuis system in a web browser, I can make an EditHistory connection to a history server Smalltalk system, remove via garbage collection every method not run since the client was started, and imprint needed methods from the server as the client continues to run.

This is a garbage collection technique that I had previously called “Dissolve”, but I think the details are easier to explain with a different metaphor: “shaking” loose and removing everything which isn’t attached to the system through usage. This is a form of dynamic dead code elimination. The technique has two phases: “fusing” methods that must not be removed, and “shaking” loose all the others, removing them. This has a cascading effect, as the literals of removed methods without additional references are also removed, and further objects without references are removed as well.

After unfused methods and their associated objects are removed, the subsystems that provided them are effectively unloaded. For the system to use that functionality again, the methods must be reloaded. This is possible using the Naiad module system. By connecting a client system to a history server before shaking, the client can reload missing methods from the server as they are needed. For example, if the Morphic UI subsystem is shaken away, and the user then attempts to use the UI, the parts of Morphic needed by the user’s interactions are reloaded as needed.

This technology is useful for delineating subsystems that were created without regard to modularity, and creating deployable modules for them. It’s also useful for creating minimal systems suited to a specific purpose. You can fuse all the methods run by the unit tests for an app, and shake away all the others, while retaining the ability to debug and extend the system.

how it works

Whether a method is fused or not is part of the state of the virtual machine running the system, and is reset when the virtual machine starts. On system resumption, no method is fused. Each method can be told to fuse itself manually, through a primitive interface. Otherwise, methods are fused by the virtual machine as they are run. A class called Shaker knows which methods in a typical system are essential for operation. A Shaker instance can ensure those methods are fused, then shake the system.
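
In use, that might look like this (a sketch; the selector names are assumptions, but the steps are as described above):

| shaker |
thisContext method fuse.	"fuse one method manually, via the primitive interface"
shaker := Shaker new.
shaker fuseEssentialMethods.	"fuse the methods known to be essential"
shaker shake	"replace every unfused method, collecting the garbage"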

Shaking itself invokes a variant of the normal OpenSmalltalk garbage collector. It replaces each unfused method with a special method which, when run, knows how to install the original method from a connected history server. In effect, all unfused methods are replaced by a single method.

Reinstallation of a method uses Naiad behavior history metadata, obtained by remote messaging with a history server, to reconstruct the method and put it in the proper method dictionary. The process creates any necessary prerequisites, such as classes and shared pools. No compiler is needed, because methods are constructed from previously-generated instructions; source code is merely an optional annotation.

the benefits of livecoding all the way down

I developed the virtual machine support for this feature with Bert Freudenberg‘s SqueakJS virtual machine, making heavy use of the JavaScript debugger in a web browser. I was struck by how much faster this sort of work is with a completely livecoded environment, rather than the C-based environment in which we usually develop the virtual machine. It’s similar to the power of Squeak’s virtual machine simulator. The tools, living in JavaScript, aren’t as powerful as Smalltalk-based ones, but they operate on the final Squeak virtual machine, rather than a simulation that runs much more slowly. Rebuilding the virtual machine amounts to reloading the web page in which it runs, and takes a few seconds, rather than the ordeal of a C-based build.

Much of the work here involved trial and error. How does Shaker know which methods are essential for system operation? I found out directly, by seeing where the system broke after being shaken. One can deduce some of the answer; for example, it’s obvious that the methods used by method contexts of current processes should be fused. Most of the essential methods yet to run, however, are not obvious. It was only because I had an interactive virtual machine development environment that it was feasible to restart the system and modify the virtual machine as many times as I needed (many, many times!), in a reasonable timeframe. Being able to tweak the virtual machine in real time from Smalltalk was also indispensable for debugging and feature development.

I want to thank Bert again for his work on SqueakJS. Also, many thanks to Dan Ingalls and the rest of the Lively team for creating the environment in which SqueakJS was originally built.

release schedule

I’m preparing Shaker for the next seasonal release of Caffeine, on the first 2019 solstice, 21 June 2019. I’ll make the virtual machine changes available for all OpenSmalltalk host platforms, in addition to the Web and Node platforms that Caffeine uses via the SqueakJS virtual machine. There may be alpha and beta releases before then.

If this technology sounds interesting to you, please let me know. I’m interested in use cases for testing. Thanks!