Archive for the Context Category

dynamic translation of Smalltalk to WebAssembly

Posted in Caffeine, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS on 26 July 2023 by Craig Latta
continuing with the DNA theme…

In Catalyst, a WebAssembly implementation of the OpenSmalltalk virtual machine, there are three linguistic levels in play: Smalltalk, JavaScript (JS), and WebAssembly (WASM). Smalltalk is our primary language, JS is the coordinating language of the hosting environment (a web browser), and WASM is a high-performance runtime instruction set to which we can compile any other language. In a previous article, I wrote about automatic translation of JS to WASM, as a temporary way of translating the SqueakJS virtual machine to WASM. That benefits from a proven JS starting point for the relatively large codebase of the virtual machine. When translating individual Smalltalk compiled methods for “just-in-time” optimization, however, it makes more sense to translate from Smalltalk to WASM directly.

compiled method transcription

We already have infrastructure for transcribing Smalltalk compiled methods, via class InstructionStream. We use it to print human-readable descriptions of method instructions, and to simulate their execution in the Smalltalk debugger. We can also use it to translate a method to human-readable WebAssembly Text (WAT) source code, suitable for translation to binary WASM code which the web browser can execute. Since the Smalltalk and WASM instruction sets are both stack-oriented, the task is straightforward.

I’ve created a subclass of InstructionStream, called WATCompiledMethodTranslator, which uses the classic scanner pattern to drive translation from Smalltalk instructions to WASM instructions. With accompanying WASM type information for Smalltalk virtual machine structures, we can make WASM modules that execute the instructions for individual Smalltalk methods.

the “hello world” of Smalltalk: 3 + 4

As an example, let’s take a look at translating the traditional first Smalltalk expression, 3 + 4. We’ll create a Smalltalk method in class HelloWASM from this source:

HelloWASM>>add
	"Add two numbers."

	^3 + 4

This gives us a compiled method with the following Smalltalk instructions. On each line below, we list the program counter value, the instruction, and a description of the instruction.

0: 0x20: push the literal constant at index 0 (3) onto the method's stack
1: 0x21: push the literal constant at index 1 (4) onto the method's stack
2: 0xB0: send the arithmetic message at index 0 (+)
3: 0x7C: return the top of the method's stack

A WATCompiledMethodTranslator uses an instance of InstructionStream as a scanner of the method, interpreting each Smalltalk instruction in turn. When interpreting an instruction, the scanner sends a corresponding message to the translator, which in turn writes a transcription of that instruction as WASM instructions, onto a stream of WAT source.
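For readers less familiar with Smalltalk, the scanner pattern can be sketched in Python. This is only an illustration, assuming the four bytecodes listed above; the names `Scanner`, `PrintingClient`, `push_constant`, `send`, and `return_top` are hypothetical and are not part of Squeak or Catalyst.

```python
# Hypothetical sketch of the scanner pattern: a scanner decodes bytecodes and
# sends one message per instruction to a client, which decides what to do.


class Scanner:
    def __init__(self, bytecodes, literals):
        self.bytecodes = bytecodes
        self.literals = literals
        self.pc = 0

    def interpret_next_instruction_for(self, client):
        byte = self.bytecodes[self.pc]
        self.pc += 1
        if 0x20 <= byte <= 0x3F:
            # push the literal constant at index (byte - 0x20)
            client.push_constant(self.literals[byte - 0x20])
        elif byte == 0xB0:
            # assumption: only arithmetic message index 0 (+) is handled here
            client.send("+", 1)
        elif byte == 0x7C:
            client.return_top()
        else:
            raise ValueError(f"unhandled bytecode 0x{byte:02X}")

    def scan(self, client):
        while self.pc < len(self.bytecodes):
            self.interpret_next_instruction_for(client)


class PrintingClient:
    """A client that records a readable description of each instruction."""

    def __init__(self):
        self.lines = []

    def push_constant(self, value):
        self.lines.append(f"push constant {value}")

    def send(self, selector, argument_count):
        self.lines.append(f"send {selector}")

    def return_top(self):
        self.lines.append("return top")


client = PrintingClient()
Scanner([0x20, 0x21, 0xB0, 0x7C], [3, 4]).scan(client)
print("\n".join(client.lines))
```

A translator is then just another client: it answers the same messages, but writes WAT instead of descriptions.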

The first instruction in the method is “push the literal constant at index 0”. The scanner finds the indicated literal in the literal frame of the method (i.e., 3), and sends pushConstant: 3 to the translator. Here are the methods that the translator runs in response:

WATCompiledMethodTranslator>>pushConstant: value
	"Push value, a constant, onto the method's stack."

	self
		comment: 'push constant ', value printString;
		pushFrom: [value printWATFor: self]

WATCompiledMethodTranslator>>pushFrom: closure
	"Evaluate closure, which emits WASM instructions that push a value onto the WASM stack. Emit further WASM instructions that push that value onto the Smalltalk stack."

	self
		setElementAtIndexFrom: [
			self
				incrementField: #sp
				ofStructType: #vm
				named: #vm;
				getField: #sp
				ofStructType: #vm
				named: #vm]
		ofArrayType: #pointers
		named: #stack
		from: closure

WATCompiledMethodTranslator>>setElementAtIndexFrom: elementIndexClosure ofArrayType: arrayTypeName named: arrayName from: elementValueClosure
	"Evaluate elementIndexClosure to emit WASM instructions that leave an array index on the WASM stack. Evaluate elementValueClosure to emit WASM instructions that leave an array element value on the WASM stack. Emit further WASM instructions, setting the element with that index in an array of the given type and variable name to the value."

	self get: arrayName.
	{elementIndexClosure. elementValueClosure} do: [:each | each value].

	self
		indent;
		nextPutAll: 'array.set $';
		nextPutAll: arrayTypeName

In the final method above, we finally see a WASM instruction, array.set. The translator implements stream protocol for actually writing WAT text to a stream. The comment:, get:, and getField:ofStructType:named: methods are similar, emitting “;;” comments and the global.get and struct.get WASM instructions, respectively. The array and struct instructions are part of the WASM garbage collection extension, which introduces types.
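To make the flow of those three methods easier to follow, here is a rough Python mirror of them. It is a sketch under assumptions: the class `WATEmitter` and its method names are hypothetical, it only builds WAT text without validating it, and it assumes a small integer constant is written as an `i32.const`.

```python
# Hypothetical Python mirror of the translator methods above. It only builds
# WAT text; it does not validate or run the resulting WebAssembly.


class WATEmitter:
    def __init__(self):
        self.lines = []

    def emit(self, text):
        self.lines.append(text)

    def comment(self, text):
        self.emit(f";; {text}")

    def get(self, name):
        self.emit(f"global.get ${name}")

    def get_field(self, field, struct_type, name):
        self.get(name)
        self.emit(f"struct.get ${struct_type} ${field}")

    def increment_field(self, field, struct_type, name):
        self.get(name)  # struct reference that struct.set will consume
        self.get_field(field, struct_type, name)
        self.emit("i32.const 1")
        self.emit("i32.add")
        self.emit(f"struct.set ${struct_type} ${field}")

    def set_element(self, index_closure, array_type, array_name, value_closure):
        self.get(array_name)
        index_closure()   # leaves the element index on the WASM stack
        value_closure()   # leaves the element value on the WASM stack
        self.emit(f"array.set ${array_type}")

    def push_from(self, value_closure):
        def index_closure():
            self.increment_field("sp", "vm", "vm")
            self.get_field("sp", "vm", "vm")

        self.set_element(index_closure, "pointers", "stack", value_closure)

    def push_constant(self, value):
        self.comment(f"push constant {value}")
        # assumption: the constant is emitted as an i32 constant
        self.push_from(lambda: self.emit(f"i32.const {value}"))


emitter = WATEmitter()
emitter.push_constant(3)
print("\n".join(emitter.lines))
```

The closures matter because the order of emission is the order of evaluation on the WASM stack: array reference first, then index, then value.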

WASM types for virtual machine structures

To actually use WASM instructions that make use of types, we need to define the types in our method’s WASM module. In pushFrom: above, we use a struct variable of type vm named vm, and an array variable of type pointers named stack. The vm variable holds global virtual machine state (for example, the currently executing method’s stack pointer), similar to the SqueakJS.vm variable in SqueakJS. The stack variable holds an array of Smalltalk object pointers, constituting the current method’s stack. In general, the WASM code for a Smalltalk method will also need fast variable access to the active Smalltalk context, the active context’s stack, the current method’s literals, and the current method’s temporary variables.

Our WASM module for HelloWASM>>add might begin like this:

(module
	(type $bytes (array (mut i8)))
	(type $words (array (mut i32)))
	(type $pointers (array (ref $object)))

	(type $object (struct
		(field $metabits (mut i32))
		(field $class (ref $object))
		(field $format (mut i32))
		(field $hash (mut i32))
		(field $pointers (ref $pointers))
		(field $words (ref $words))
		(field $bytes (ref $bytes))
		(field $float (mut f32))
		(field $integer (mut i32))
		(field $address (mut i32))
		(field $nextObject (ref $object))))

	(global $vm (struct
		(field $sp (mut i32))
		(field $pc (mut i32))))

	(global $stack (array (ref $pointers)))

	(func $HelloWASM_add
		;; pc 0
		;; push constant 3
		global.get $stack
		global.get $vm
		global.get $vm
		struct.get $vm $sp
		i32.const 1
		i32.add
		struct.set $vm $sp ;; increment the stack pointer
		global.get $vm
		struct.get $vm $sp
		i32.const 3
		array.set $pointers
		
		;; pc 1
		...

As is typical with assembly-level code, there’s a lot of setup involved which seems quite verbose, but it enables fast paths for the execution machinery. We’re also effectively taking on the task of writing the firmware for our idealized Smalltalk processor, by setting up interfaces to contexts and methods, and by implementing the logic for each Smalltalk instruction. In a future article, I’ll discuss the mechanisms by which we actually run the WASM code for a Smalltalk method. I’ll also compare the performance of dynamic WASM translations of Smalltalk methods versus the dynamic JS translations that SqueakJS makes. I don’t expect the WASM translations to be much (or any) faster at the moment, but I do expect them to get faster over time, as the WASM engines in web browsers improve (just as JS engines have).

Catalyst, a WebAssembly-enabled version of SqueakJS

Posted in Appsterdam, consulting, Context, livecoding on 8 May 2023 by Craig Latta
Catalyst uses a synchronized linear representation of object memory to coordinate JavaScript and WebAssembly

It’s straightforward to apply WebAssembly (WASM) to isolated JavaScript hotspots, where there are no side-effects. This enables us to speed up sections of the SqueakJS primitives, like BitBLT, which perform pure functions. The SqueakJS interpreter code, on the other hand, is rife with side-effects. The act of interpretation modifies the deep graph of connected structures which is the Smalltalk object memory.

In my first WASM-enabled SqueakJS virtual machine, I wrote a WASM version of the JS function that interprets a single Smalltalk instruction, interpretOneWASM. To perform the necessary interactions with the object memory, that WASM function called the original JS support functions. That approach yielded a working virtual machine, but with a high performance penalty. Since JS and WASM can use shared memory, I’m curious to see if we can eliminate the JS function calls in interpretOneWASM by using a synchronized linear representation of object memory, and if this would speed up the virtual machine.

I’ve modified SqueakJS so that it maintains a shared WASM memory buffer, into which it writes and updates serializations of the objects in object memory. Now, interpretOneWASM can read and write the object information it needs without having to make most of the previous JS calls. At the moment, some primitives (like garbage collection) are still JS calls. Since they were relatively long-running operations already, I don’t expect performance to suffer much.

The format of the shared linearized object memory is nearly identical to that of an object memory snapshot, with two differences. The first is that only one object header word is written in all cases; WASM doesn’t need the additional object header information supplied in a snapshot for re-creating objects, since it doesn’t do that. The second difference is that young objects, with negative addresses, are also represented, in a special memory segment above the tenured objects. This scheme allows WASM to access all objects by address.

When WASM modifies the object memory, it records the addresses of the modified objects in a special object. The JS side uses this information to update the canonical JS object memory representation. If at some point we implement all of the JS functions in WASM, we’ll be able to make the linear representation the only one. In the meantime, we have a framework for using both JS and WASM to their strengths, and transitioning from JS to WASM gradually.
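The synchronization idea can be sketched in a few lines of Python. This is a toy model, not the Catalyst format: the 32-bit word size, the one-header-word-plus-fields layout, the plain list standing in for the "special object", and all function names are assumptions made for illustration.

```python
import struct

WORD = 4  # assumption: 32-bit words


class LinearMemory:
    """A byte buffer standing in for the shared WASM memory."""

    def __init__(self, size_in_words):
        self.buffer = bytearray(size_in_words * WORD)
        self.modified = []  # stand-in for the special object of modified addresses

    def read_word(self, address):
        return struct.unpack_from("<I", self.buffer, address)[0]

    def write_word(self, address, value):
        struct.pack_into("<I", self.buffer, address, value)


def serialize(memory, address, header, fields):
    """Write one header word followed by the object's fields."""
    memory.write_word(address, header)
    for index, field in enumerate(fields):
        memory.write_word(address + WORD * (index + 1), field)


def linear_side_store(memory, object_address, field_index, value):
    """Stand-in for the WASM side: store a field and note the modified object."""
    memory.write_word(object_address + WORD * (field_index + 1), value)
    if object_address not in memory.modified:
        memory.modified.append(object_address)


def sync_canonical(memory, canonical):
    """Stand-in for the JS side: refresh canonical objects from the buffer."""
    for address in memory.modified:
        obj = canonical[address]
        obj["fields"] = [
            memory.read_word(address + WORD * (index + 1))
            for index in range(len(obj["fields"]))
        ]
    memory.modified.clear()


memory = LinearMemory(16)
canonical = {8: {"header": 0x0A, "fields": [1, 2, 3]}}
serialize(memory, 8, canonical[8]["header"], canonical[8]["fields"])
linear_side_store(memory, 8, 1, 42)
sync_canonical(memory, canonical)
print(canonical[8]["fields"])
```

Only the objects named in the modification log are re-read, which keeps the cost of synchronization proportional to what actually changed.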

The user can choose dynamically whether the JS or the WASM version of the interpretOne function is in use. At the moment, we let JS read the object memory snapshot and get the system started, then switch to WASM after the initial context is loaded and running. JS also writes the entire linearized object memory into the shared WASM buffer. The user can switch interpretOne from JS to WASM by evaluating a Smalltalk expression that invokes a JS function of the interpreter (“JS display vm useWASM”).

It’ll be interesting to see if this scheme yields an overall increase in system speed. Hopefully, with this gradual transition, we’ll be able to tell if a full conversion of SqueakJS from JS to WASM is worthwhile, before investing a lot of effort into rewriting the JS functions.

a WebAssembly Squeak virtual machine is running

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, Spoon, SqueakJS on 14 April 2023 by Craig Latta
the instructions are ticking!

I’ve replaced the inner instruction-dispatch loop of a running SqueakJS virtual machine with a handwritten WebAssembly (WASM) function, and run several thousand instructions of the Caffeine object memory. The WASM module doesn’t yet have its own memory. It’s using the same JavaScript objects that the old dispatch loop did, and the supporting JS state and functions (like the Squeak.Interpreter class). I wrote a simple object proxy scheme, whereby WASM can use unique integer identifiers to refer to the Smalltalk objects.

Because of this indirection, the current performance is very slow. The creation of an object proxy is based on stable object pointer (OOP) values; young objects require full garbage collection to stabilize their OOPs. There is also significant overhead in calling JavaScript functions from WASM. At this stage, the performance penalties are worthwhile. We can verify that the hybrid JS/WASM interpreter is working, without having to write a full WASM implementation first.
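The general shape of such a proxy scheme is a table that hands out integer identifiers and resolves them again. The Python below is a generic sketch, not the SqueakJS code: it keys on Python object identity rather than on stable OOP values, and the class and method names are hypothetical.

```python
class ProxyTable:
    """Maps objects to small integer identifiers and back."""

    def __init__(self):
        self._objects = []  # identifier -> object
        self._ids = {}      # id(object) -> identifier

    def proxy_for(self, obj):
        """Answer the identifier for obj, allocating one on first use."""
        key = id(obj)
        if key not in self._ids:
            self._ids[key] = len(self._objects)
            self._objects.append(obj)  # keeps obj alive, so id() stays unique
        return self._ids[key]

    def object_for(self, identifier):
        return self._objects[identifier]


table = ProxyTable()
first, second = object(), object()
handles = [table.proxy_for(first), table.proxy_for(second), table.proxy_for(first)]
print(handles)
```

Every access from the integer side costs a table lookup plus a call across the language boundary, which is the indirection the performance penalty comes from.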

a hybrid approach

My original approach was to recapitulate the Slang experience, by using Epigram to decompile the Smalltalk methods of a virtual machine to WASM. I realized, though, that it’s better to take advantage of the livecoding capacity of the SqueakJS VM. I can replace individual functions of the SqueakJS VM, maintaining a running system all the while. I can also switch those functions back and forth while the system is running, perhaps many millions of instructions into a Caffeine session. This will be invaluable for debugging.

The next goal is to duplicate the object memory in a WASM memory, and operate on it directly, rather than using the object proxy system. I’ll start by implementing the garbage collector, and testing that it produces correct results with an actual object memory, by comparing its behavior to that of the SqueakJS functions.

Minimal object memories will be useful in this process, because garbage collection is faster, and there is less work to do when resuming a snapshot.

performance improvement expected

From my experiment with decompiling a Smalltalk method for the Fibonacci algorithm into WASM, I saw that WASM improves the performance of send-heavy Smalltalk code by about 250 times. I was able to achieve significant speedups from the targeted use of WASM for the inner functions of BitBLT. From surveying performance comparisons between JS and WASM, I’m expecting a significant improvement for the interpreter, too.

tactical Squeak speedups with WebAssembly

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, SqueakJS on 7 April 2023 by Craig Latta

With the JavaScript bridge in SqueakJS, we can utilize built-in web browser behavior and other JS frameworks from Smalltalk, just as any other JS code would. I’ve used it to build Caffeine apps using A-Frame and croquet.io. Another useful framework we can integrate is WebAssembly (WASM), a stack-oriented instruction set for writing high-performance code. I have begun to identify performance-critical code in the SqueakJS virtual machine, and replace it with WASM code. The initial results are encouraging and useful.

identifying hotspots

I’m running SqueakJS in the Chrome web browser. To identify virtual machine code that consumes the most time, I profile use cases that seem slow, using Chrome’s built-in devtools. The first use case I chose was drag-selecting a large quantity of text in a workspace.

a performance capture of drag-selecting text, indicating that rgbMulwith() is the most time-consuming inner BitBLT function

From reviewing a performance capture of this use case, we can see that rgbMulwith() is the most time-consuming inner function from the BitBLT plugin. While it doesn’t modify variables in outer scopes, it does read a plugin-global variable. Most of the work it does, however, is done by partitionedMulwithnBitsnPartitions(), a pure function returning the result of a mathematical operation on the inputs, without any other system state interaction. That makes it well-suited to WASM implementation.
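To show what a pure, partition-wise function of this kind looks like, here is a Python sketch that multiplies two packed words one component at a time. It is not the plugin's code: the loop structure and the `(a + 1) * (b + 1) - 1` scaling are assumptions, chosen so that a full-intensity component times a full-intensity component stays at full intensity.

```python
def partitioned_mul(word1, word2, n_bits, n_partitions):
    """Multiply two packed words, treating each n_bits-wide partition separately."""
    mask = (1 << n_bits) - 1
    result = 0
    for partition in range(n_partitions):
        shift = partition * n_bits
        a = (word1 >> shift) & mask
        b = (word2 >> shift) & mask
        # assumption: scale the product back down into n_bits
        product = ((a + 1) * (b + 1) - 1) >> n_bits
        result |= (product & mask) << shift
    return result


# Multiplying by white (all components at maximum) leaves a color unchanged.
print(hex(partitioned_mul(0xFF8040, 0xFFFFFF, 8, 3)))
```

The function reads only its arguments and writes only its result, which is what makes a routine like this easy to move into WASM.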

While there are APIs for coordinating side effects with JavaScript, they are relatively slow. It is therefore harder to rationalize WASM implementations of individual higher-level Squeak virtual machine primitives, since they interact extensively with complex JavaScript objects like the Squeak interpreter. Eventually, we’ll represent the entire Squeak object memory inside a WASM memory, and implement the entire Squeak virtual machine with WASM functions. WASM garbage collection will assist the Squeak garbage collector, much as the JavaScript garbage collector assists the SqueakJS VM now. JavaScript interaction will be limited to the WASM implementation of the SqueakJS JS bridge.

translating from JS to WASM

Here’s the existing JS implementation of partitionedMulwithnBitsnPartitions():

With its stack-based instructions, WASM code is reminiscent of Smalltalk bytecode. Here’s some of the equivalent WASM implementation of the above function, written by hand. The WASM memory holds the maskTable from BitBltPlugin.js.

a section of the equivalent WASM

Note that WASM’s shift-left and shift-right instructions are fine as is; we don’t need to make wrapper functions for them as we did in JS.

After I modified the BitBLT plugin so that rgbMulwith() uses partitionedMUL(), drag-selecting text in the Caffeine user interface was much more responsive, and a different inner BitBLT plugin function was the most time-consuming. Even though rgbMulwith() used a small percentage of total time in the first performance capture, every saved millisecond significantly improves perceived animation smoothness. By using additional use cases (scrolling long lists, and repainting by alternating the stacking order of two windows), I identified other inner BitBLT plugin functions to optimize. The Caffeine user interface is now much more responsive than it was. This is especially useful with Worldly, the spatial IDE I’m building with Caffeine and A-Frame, where every bit of performance matters.

an alternative to writing WASM by hand

For the JS code in the Squeak virtual machine, it makes sense to write replacement WASM code by hand. Since WASM code is so similar to Smalltalk bytecode, for Smalltalk compiled methods it makes more sense to use automated decompilation to WASM. I have done this for a small proof-of-concept, using a Smalltalk compiled method for the Fibonacci algorithm.

Using the Smalltalk compiler and decompiler I wrote with my Epigram parsing framework, I was able to decompile the Smalltalk compiled method for the Fibonacci algorithm into WASM text. I then used an in-browser version of the WebAssembly Binary Toolkit from Caffeine to generate binary WASM, compile it in the current page as a function, and call the function. Comparing the execution time of finding the 29th Fibonacci number in both Smalltalk and WASM showed that WASM had 250 times the execution speed of the normal SqueakJS bytecode-to-JS translator.

I plan to write, in Smalltalk, a version of the Squeak virtual machine simulator that stores all objects in a WASM memory. Once it can evaluate (3 + 4), I’ll translate all its Smalltalk compiled methods to WASM, and see how much faster it runs. The next step will be to get a JS bridge working, and implement interfaces to the web browser DOM for graphics and user input event handling. Ultimately, a WASM implementation of the Squeak virtual machine may be preferable to the SqueakJS virtual machine.

Caffeine Web services through Deno

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS on 9 July 2022 by Craig Latta
Caffeine in a Deno worker can provide Web APIs to Smalltalk in a native app.

bridging native apps and the Web

We’ve been able to run Caffeine headlessly in a Web Worker for some time now, using NodeJS. I’ve updated this support to use the Deno JavaScript runtime instead of Node. This gives us better access to familiar Web APIs, and a cleaner module system without npm. I’ve also extended the bridging capability of the code that Deno runs. Now, a native Squeak app can start Deno (via class OSProcess), Deno starts Caffeine in a worker, and the two Smalltalk instances can communicate with each other via remote messaging.

I’m using this bridge to let native Squeak participate in WebRTC sessions with other Smalltalks, as part of the Naiad team development system. The same Squeak object memory runs in both the native Squeak and the Deno worker. I’m sure many other interesting use cases will arise, as we explore what native Squeak and Web Squeak can do together!

Epigram: reifying grammar production rules for clearer parsing, compiling, and searching

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, SqueakJS, Uncategorized on 28 June 2022 by Craig Latta
a section of the Smalltalk grammar

putting production rules to work

In a traditional EBNF grammar, production rules describe all the allowed relationships between a language’s terminal symbols. By expressing them as live objects with behavior, they can parse and compile as well. They form a definitive reference network in which to record parsed terminals, making them ideally suited as parse trees. Individual rules also function as search terms in other rules which use them. Epigram is a framework for doing this. Let’s explore these features with an example.

We’ll use the grammar for Smalltalk methods. The production rules are included in the book Smalltalk-80: The Language and Its Implementation by Adele Goldberg and Dave Robson. They are depicted visually, with railroad diagrams (a few of them are shown above).

Each diagram shows a path going through one or more symbols. An EBNF production rule, or grammar symbol, is indicated by the name of the rule in a box. A terminal symbol is indicated by a circle with the symbol inside. An alternation is indicated by a path’s divergence through multiple symbols, converging afterward. A compound rule is indicated by a path going directly through multiple symbols. A repetition is indicated by a loop through a sequence of symbols, representing one or more occurrences of that sequence. EBNF also supports the option, which is no or one occurrence of a symbol, and the difference, which matches one rule but not another. These kinds of rules are sufficient for the Smalltalk grammar. There are other grammars, like XML, that extend BNF further, but we won’t discuss them here.

production rules as code

We can express these diagrams as code. For a terminal symbol, we can use a literal string. For an alternation, we can use a “|” (“or”) operator. For a compound rule, we can use a “||” (“then”) operator (after changing the Smalltalk compiler so that it doesn’t confuse “||” with “|”). For repetitions and options, we can use the unary messages “repetition” and “option”. We can store entire production rules as shared variables (pool variables in Squeak).

For example, we can write the first diagram as:

Digit := '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'.

Digit is an instance of class Alternation, and can be a variable in a SmalltalkProductionRules pool. We can write the second diagram as:

Digits := Digit repetition.

Digits is an instance of class Repetition. A rule which uses a compound rule is:

SymbolConstant := '#' || Symbol.

We can write each rule in this way, culminating with Method.

parsing

Once we’ve created all the rules for our grammar, we can ask the topmost rule, Method, to parse the source code of a method. To parse, a rule creates a stream on the proposed content, and attempts to accept the next character in the stream until the stream is empty. For example, a terminal symbol for ‘3’ will accept the next character if it is $3.

A symbol which consists of other symbols will delegate parsing to those symbols. An alternation between the terminal symbols for ‘3’ and ‘4’ will accept the next character if it is $3 or $4, but it decides this by delegating the parse to each of those symbols, and noting which of them was able to accept the next character. A terminal symbol’s parse succeeds if it accepts every character in its string. The parse of a compound rule, alternation, or repetition succeeds if a sufficient set of its subsymbols succeed.

If a symbol doesn’t succeed, it fails and resets the stream’s position to where it was before parsing began. Control is returned to the delegating symbol, which may then try another option. This is called backtracking. If the overall parse backtracks all the way to the topmost rule without having emptied the stream, and the next character is unacceptable, then the entire parse fails and the content is ungrammatical. Having reached this point, however, we have information about which rules failed and how far the parse got in the stream. This is useful information to present to the user, with an exception.

The complexity of a grammar can make backtracking very expensive in time; reducing this cost is the main challenge in Epigram development currently. Informed choices of alternation orders in a grammar (as with a parsing expression grammar) and primitives (described below) yield dramatic performance increases.
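The combination of rules-as-objects and backtracking can be sketched compactly in Python. This is an illustration of the idea, not Epigram: Python has no “||” operator, so “>>” stands in for “then”; alternations use ordered choice only; and the `Tagged` rule is a hypothetical example loosely modeled on SymbolConstant.

```python
# Hypothetical sketch: production rules as objects that parse with backtracking.


def as_rule(value):
    """Wrap plain strings as terminal symbols."""
    return value if isinstance(value, Rule) else Terminal(value)


class Rule:
    def __or__(self, other):
        return Alternation([self, as_rule(other)])

    def __ror__(self, other):
        return Alternation([as_rule(other), self])

    def __rshift__(self, other):
        return Sequence([self, as_rule(other)])

    def __rrshift__(self, other):
        return Sequence([as_rule(other), self])

    def repetition(self):
        return Repetition(self)

    def parse(self, text):
        """Answer whether this rule accepts all of text."""
        return self.match(text, 0) == len(text)


class Terminal(Rule):
    def __init__(self, string):
        self.string = string

    def match(self, text, position):
        if text.startswith(self.string, position):
            return position + len(self.string)
        return None  # failure: the caller's position is untouched


class Alternation(Rule):
    def __init__(self, options):
        self.options = options

    def match(self, text, position):
        for option in self.options:  # ordered choice, as in a PEG
            end = option.match(text, position)
            if end is not None:
                return end
        return None


class Sequence(Rule):
    def __init__(self, parts):
        self.parts = parts

    def match(self, text, position):
        for part in self.parts:
            position = part.match(text, position)
            if position is None:
                return None  # backtrack to the delegating rule
        return position


class Repetition(Rule):
    def __init__(self, rule):
        self.rule = rule

    def match(self, text, position):
        end = self.rule.match(text, position)
        if end is None:
            return None  # one or more occurrences are required
        while True:
            following = self.rule.match(text, end)
            if following is None or following == end:
                return end
            end = following


Digit = Terminal("0") | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
Digits = Digit.repetition()
Tagged = "#" >> Digits

print(Digits.parse("2023"), Digits.parse("20a3"), Tagged.parse("#42"))
```

Because a position is just an integer passed down the call chain, a failing rule never disturbs its caller's position; that is the whole of the backtracking mechanism in this sketch. It keeps no parse history, which is where Epigram's approach goes further.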

compilation

If a parse is successful, we are left with a graph of successful production rules, each with a record of the characters it accepted, and its successful constituent symbols. We can use this graph as we would have used a traditional parse tree. Compilers can use the parse graph to create objects representing the source content in a useful structure. For example, we can create a CompiledMethod of Smalltalk virtual machine instructions, embodying the behavior specified by the source code.

For example, if our source code were:

add
	"Add two numbers and answer the result."

	^3 + 4

The successful rules in our parse, in chronological order, would be:

  • Letter ($a)
  • Letter ($d)
  • Letter ($d) — further Letter successes are elided.
  • Identifier (‘add’)
  • UnarySelector (‘add’)
  • MessagePattern (‘add’)
  • SpecialCharacter (carriage return)
  • SpecialCharacter (tab)
  • Comment (‘"Add two numbers and answer the result."’)
  • Digit ($3)
  • Number (‘3’)
  • Literal (‘3’)
  • SpecialCharacter ($+)
  • BinarySelector (‘+’)
  • Literal (‘4’)
  • BinaryExpression (‘3 + 4’)
  • MessageExpression (‘3 + 4’)
  • Expression (‘3 + 4’)
  • Statements (‘3 + 4’)
  • Method (‘add “…” ^3 + 4’)

To get the intended method selector (#add), a compiler holding this parse history can simply ask the Method rule for its MessagePattern. The compiler can also ask the Expression to generate the Smalltalk stack machine instructions that carry it out.

searching

Since MessagePattern is a well-known shared variable in the SmalltalkProductionRules pool, the compiler can use it as a search term in queries to Method:

selector := (Method at: MessagePattern) terminals

Using production rules as search terms is a very useful way of navigating the grammatical structure of the parse tree, allowing the compiler writer to apply their knowledge of the grammar. Rather than focusing on how parsing works, or how to manipulate a parse tree which is separate from the grammar, one may express compilation entirely with the grammar’s rules.
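A rough Python picture of this kind of query follows. It is a sketch under assumptions: the parse graph is built by hand rather than by a parser, `ParseNode` and its `at` method are hypothetical names, and Epigram's actual lookup semantics may differ.

```python
class Rule:
    """A named production rule, used here only as a search key."""

    def __init__(self, name):
        self.name = name


class ParseNode:
    """One successful rule, with the characters it accepted and its constituents."""

    def __init__(self, rule, terminals, children=()):
        self.rule = rule
        self.terminals = terminals
        self.children = list(children)

    def at(self, rule):
        """Answer the first node below this one (depth-first) produced by rule."""
        for child in self.children:
            if child.rule is rule:
                return child
            found = child.at(rule)
            if found is not None:
                return found
        return None


Method = Rule("Method")
MessagePattern = Rule("MessagePattern")
UnarySelector = Rule("UnarySelector")
Statements = Rule("Statements")
Expression = Rule("Expression")

# A hand-built fragment of the parse graph for the add method.
parse = ParseNode(Method, 'add "..." ^3 + 4', [
    ParseNode(MessagePattern, "add", [ParseNode(UnarySelector, "add")]),
    ParseNode(Statements, "3 + 4", [ParseNode(Expression, "3 + 4")]),
])

selector = parse.at(MessagePattern).terminals
print(selector)
```

The search key is the rule object itself, so the compiler writer navigates with the same vocabulary the grammar was written in.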

performance optimization: primitives

It’s very convenient and clear to express a grammar as EBNF rules, but it can lead to alternations between many options, with expensive parsing behavior. Since the grammar keeps a complete history of the accepted rules for a parse, we can easily see which rules are most popular and consume the most time. For these rules, we can specify Smalltalk code equivalent to their parsing work, providing primitives. For XML, which has frequently-used alternations between thousands of Unicode characters, primitives provide speedups of 200 times or more.

enforcing constraints

Some grammars specify additional constraints on parsed content. For example, the HTML grammar requires an element’s opening and closing tags to match. Epigram supports adding constraints to production rules, in the form of block closures which must evaluate to true after parsing has taken place.

resolving ambiguities

Some grammars include points of intentional ambiguity. In Smalltalk, for example, there’s a grammatical ambiguity between chains of unary and binary messages. Epigram supports noting ambiguities, and resolving them through constraints. In the Smalltalk example, the ambiguity is resolved through a constraint that considers the scope in which parsing occurs: which variable names are currently bound, and which unary and binary messages are actually defined, together lead to a single interpretation.

decompilation

Writing a Smalltalk decompiler with reified production rules is also easier. The rule for a method declaration can dispatch decompilation for each bytecode to the corresponding instruction class, resulting in a set of equivalent instruction instances. An instruction which pops the virtual machine stack corresponds to a Smalltalk statement, and it can construct a structure of production rules equivalent to that statement, as if created from a parse. The rule structure can answer terminal symbols which are the equivalent source code. I’m writing an extended example of this decompilation process, as an Observable active essay with a live Caffeine session embedded inside it.

special thanks

Special thanks to Chris Thorgrimsson and Lam Research, for supporting this open-source work through commercial use cases.

Beatshifting: playing music in sync and out of phase

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, music, Smalltalk, SqueakJS on 27 April 2021 by Craig Latta
two Beatshifting timelines

I’ve written a Caffeine app implementation of the Beatshifting algorithm, for collaborative remote music performance that is synchronized and out-of-phase. Beatshifting uses network latency as a rhythmic element, using offsets from beats as timestamps, with a shared metronome and score.

I was inspired to write the Beatshifting app by NINJAM, a similar system that has hosted many hours of joyous sessions. There are a few interesting twists I think I can bring to the technology, through late-binding of audio rendering.

NINJAM also synchronizes distributed streams of rhythmic music. Its server collects an entire measure of audio from each performer’s timestamped stream, stamps the collected measures with an upcoming measure number, and sends them back to each performer. Each performer’s system plays the collected measures with the start times aligned. In effect, each performer plays along with what everyone else did a measure ago. Each performer must receive audio only by the start of the upcoming measure, rather than fast enough to create the illusion of simultaneity.

Beatshifting gives more control over the session to each performer, and to an audience as well. Each performer can modify not only the local volume levels of the other performers, but also their delays and instruments. Each performer can also change the tempo and time signature of the session. A session can have an audience as well, and each audience member is really a performer who hasn’t played anything yet.

It’s straightforward to have an arbitrary number of participants in a session because Beatshifting takes the form of a web app. Each participant only needs to visit a session link in a web browser, rather than use a special digital audio workstation (DAW) app. By default, Beatshifting uses MIDI event messages instead of audio, using much less bandwidth even with a large group.

To deliver events to each participant’s web browser, Beatshifting uses the Croquet replication service. Croquet is able to replicate and synchronize any JavaScript object in every participant’s web browser, up to 60 times per second. Beatshifting uses this to provide a shared score. Music events like notes and fader movements can be scheduled into the score by any participant, and from code run by the score itself.

One piece of code the score runs broadcasts events indicating that measures have elapsed, so that the web browsers can render metronome clicks. There are three kinds of metronome clicks, for ticks, beats, and measures. For example, with a time signature of 6/8, there are two beats per measure, and three ticks per beat. Each tick is an eighth-note, so each beat is a dotted-quarter note. The sequence of clicks one hears is:

  • measure
  • tick
  • tick
  • beat
  • tick
  • tick

At a tempo of 120 beats per minute, or 240 clicks per 60,000 milliseconds, there are 250 milliseconds between clicks. Each time a web browser receives a measure-elapsed event, it schedules MIDI events for the next measure’s clicks with the local MIDI output interface. Since each web browser knows the starting time of the session in its output MIDI interface’s timescale, it can calculate the timestamps of all ensuing clicks.
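The click arithmetic can be worked through in Python. This sketch rests on one reading of the numbers above: the 250 millisecond figure follows if the tempo counts quarter notes and each click is an eighth-note tick. That reading, and all the function names, are assumptions.

```python
def click_interval_ms(tempo_bpm, clicks_per_tempo_beat=2):
    """Milliseconds between clicks.

    Assumption: the tempo counts quarter notes and each click is an eighth note.
    """
    return 60_000 / (tempo_bpm * clicks_per_tempo_beat)


def click_kinds(beats_per_measure, ticks_per_beat):
    """The kind of each click in one measure, in order."""
    kinds = []
    for beat in range(beats_per_measure):
        for tick in range(ticks_per_beat):
            if tick == 0:
                kinds.append("measure" if beat == 0 else "beat")
            else:
                kinds.append("tick")
    return kinds


def measure_click_times(start_ms, measure_index, tempo_bpm,
                        beats_per_measure, ticks_per_beat):
    """Local timestamps of every click in the given measure."""
    interval = click_interval_ms(tempo_bpm)
    clicks_per_measure = beats_per_measure * ticks_per_beat
    first = start_ms + measure_index * clicks_per_measure * interval
    return [first + index * interval for index in range(clicks_per_measure)]


print(click_interval_ms(120))
print(click_kinds(2, 3))
print(measure_click_times(1000, 1, 120, 2, 3))
```

Since every timestamp derives from the session start time in the local timescale, each browser can compute its own schedule without exchanging clock values.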

When a performer plays a note, their web browser records the offset in milliseconds between when the note was played and the time of the most recent click. The web browser then publishes an event-scheduling message, to which the score is subscribed. The score then broadcasts a note-played event to all the web browsers. Again, it’s up to each web browser to schedule a corresponding MIDI note with its local MIDI output interface. The local timestamp of that note is chosen to be the same millisecond offset from some future click point. Each web browser can choose how far in the future that click is, based on who played the note or on any other element of the event’s data. It can also choose other parameters for each event, like instrument, volume level, and panning position.
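The two halves of that exchange can be sketched as a pair of pure functions: one run by the performer's browser to measure the offset, and one run by each receiving browser to place the note. The names, and the idea of expressing "how far in the future" as a count of clicks, are assumptions made for illustration.

```javascript
// Performer side: offset in ms between a played note and the most recent click.
function offsetFromLastClick(playedAtMs, sessionStartMs, intervalMs) {
  return (playedAtMs - sessionStartMs) % intervalMs;
}

// Receiver side: local timestamp with the same offset from a click that is
// `clicksAhead` clicks after the most recent local click.
function shiftedTimestamp(nowMs, sessionStartMs, intervalMs, offsetMs, clicksAhead) {
  const clicksSoFar = Math.floor((nowMs - sessionStartMs) / intervalMs);
  const targetClickMs = sessionStartMs + (clicksSoFar + clicksAhead) * intervalMs;
  return targetClickMs + offsetMs;
}

// The performer's session started at 1000 ms in their own timescale;
// they play a note at 1830 ms, with clicks every 250 ms.
const offsetMs = offsetFromLastClick(1830, 1000, 250);
console.log(offsetMs); // 80

// A listener's session started at 5000 ms in their timescale. It is now 5900 ms,
// and this listener chooses to render the note four clicks later.
console.log(shiftedTimestamp(5900, 5000, 250, offsetMs, 4)); // 6830
```

Only the offset travels between browsers. Each browser keeps its own session start time, so the two timescales never need to agree.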

Quantities like tempo are part of the score’s state, and can be changed by any performer or audience member. Croquet ensures that the changed JavaScript variables are synchronized in all the participants’ web browsers.

With so many decisions about how music events are rendered left to each web browser, the mix that each participant hears can be wildly different. The only constants are the millisecond beat offsets of each performer’s notes. I think it’ll be fun to compare recordings of these mixes after the fact, and to make new ones from individual recorded tracks.

There’s no server that any participant needs to set up, and the Croquet service knows nothing of the Beatshifting protocol. This makes it very easy to start and join new sessions.

next steps

The current Beatshifting UI has controls for joining a session, enabling the local scheduling of metronome clicks, and changing the tempo and time signature of a session.

the current Beatshifting UI

If one is using a MIDI output interface connected to a DAW, then one may use the DAW to control instruments, volume, panning, and so on. I’d also like to provide the option of all MIDI event rendering performed by the web browser, and a UI for controlling and recording that. I’ve established the use of the ToneJS audio framework for rendering events, and am now developing the UI.

I led a debut performance of Beatshifting as part of the Netherlands Coding Live concert series, on 23 April 2021.

I’ve written an animated 3D visualization of the Beatshifting algorithm, which can be driven from live session data. This movie is an annotated slow-motion version:

visualizing the Beatshifting algorithm

I’m excited about the creative potential of Beatshifting sessions. Please contact me if you’re interested in playing or coding for this medium!

Naiad progress 2019-12-02: online team services

Posted in Caffeine, consulting, Context, livecoding, Naiad, Smalltalk, Spoon, SqueakJS on 2 December 2019 by Craig Latta
Naiad keeps livecoders informed of their teammates’ activity, and remembers all history.

topology established

Naiad is Caffeine’s live module system. The goal is to support live versioning of classes and methods as they are edited, from connected teams of developers using Smalltalk or JavaScript IDEs from web browsers and native apps. Naiad keeps each developer informed of events meaningful to their teams and work. It’s comparable to a mashup of GitHub and Slack, and will interoperate with them as well.

The current Naiad prototype uses a relay network of NodeJS servers, each with Caffeine running in a Web Worker thread, and each serving a set of Caffeine-based client IDEs, in web browsers and native apps. The workers keep track of class and method versions, system checkpoints, and teams, using the relays to broadcast events to clients. Clients can request various services of the workers, like joining teams and making checkpoints from object memory snapshots.

These two clients are connected to the same relay server. The client on the left created a new team, by sending a message to the relay’s worker. The worker created the team, and told the relay to notify all of its peers (clients and relays). For now, clients respond by inspecting the new team.
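The notification flow just described might look something like the following. This is only a sketch under stated assumptions: the class, method, and event names are invented, and the worker is folded into the relay object for brevity, whereas Naiad runs it in a separate Web Worker thread.

```javascript
// Hypothetical sketch: all names here are invented for illustration.
class Relay {
  constructor() {
    this.peers = new Set(); // connected clients and other relays
    this.teams = new Map(); // stands in for the worker's team records
  }

  connect(peer) {
    this.peers.add(peer);
  }

  // A client asks for a new team; the team is recorded, then every peer is notified.
  createTeam(name) {
    const team = { name, members: [] };
    this.teams.set(name, team);
    this.broadcast({ type: "team-created", team });
    return team;
  }

  broadcast(event) {
    for (const peer of this.peers) peer.receive(event);
  }
}

// Two stand-in clients that log what they receive, instead of opening an inspector.
const seen = [];
const relay = new Relay();
relay.connect({ receive: (event) => seen.push(`left: ${event.type} ${event.team.name}`) });
relay.connect({ receive: (event) => seen.push(`right: ${event.type} ${event.team.name}`) });
relay.createTeam("compilers");
console.log(seen);
```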

I’ve just made the first system checkpoint, and broadcast the first team event (the creation of a team). Eventually, Naiad will support events for several services, including team chatting and screen-sharing, history management, and application deployment. I’m still eager to hear what events and services you think you would want in a livecoding notification system; please let me know! I expect the first public release of this work to be part of the second 2019 solstice release, on 22 December.

Caffeine updated for Pharo 7

Posted in Caffeine, consulting, Context, livecoding, Smalltalk, SqueakJS on 29 September 2019 by Craig Latta
Pharo 7 running on the SqueakJS virtual machine in Chrome, debugged by Squeak in a DevTools panel

I’ve updated Caffeine to run Pharo 7; please try it out! There was one virtual machine bug (primitivePerformWithArguments wasn’t manipulating the stack correctly), and I had to turn off a few Pharo features (like libGit support, which uses LibC, something I haven’t faked in the virtual machine yet).

Many thanks to the Pharo hackers in the RMOD team at INRIA Lille, for hosting me at their sprint on Friday, 27 September 2019. It was great hanging out and coding with you all. We’ll get that Pharo Apple Watch screenshot soon. :)

Exploring the Netflix player with the Caffeine Chrome extension

Posted in Caffeine, consulting, Context, livecoding, music, Smalltalk, SqueakJS on 22 September 2019 by Craig Latta
debugging Better Call Saul with Caffeine

With the latest version of the Caffeine Chrome extension, you can run Caffeine in a Chrome DevTools panel, with access to all the Chrome debugging APIs. I’ve been using it to explore the Netflix video player, for an app I’m writing that enables the viewer to edit narratives by rearranging scenes.

From a quick look at the DOM element tree for the player, it’s apparent that it’s a React app. By following a reference chain from a user interface element (like the skip-forward button), through the bound “this” object of its click-event listener, I found the internal React properties for all the player’s UI elements, and all the player functions they use (for example, for seeking forward in a video).
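A related route to the same internals, different from the listener-chasing described above, is to look at the properties React itself attaches to DOM nodes. These are undocumented and vary by React version: keys such as "__reactFiber$…" and "__reactProps$…" in React 17 and later, or "__reactInternalInstance$…" in React 16, each with a random suffix. The sketch below assumes that naming. Which React version the Netflix player uses is not something this example knows, and the helper name is hypothetical.

```javascript
// Collect whatever React-internal records are attached to an element,
// keyed by property name with the random suffix removed.
function reactInternals(element) {
  const found = {};
  for (const key of Object.keys(element)) {
    if (key.startsWith("__react")) found[key.split("$")[0]] = element[key];
  }
  return found;
}

// Stand-in for a DOM element, so the sketch runs outside a browser.
const fakeButton = {
  tagName: "BUTTON",
  "__reactProps$abc123": { onClick: () => "seek forward" },
};

const internals = reactInternals(fakeButton);
console.log(Object.keys(internals));           // [ '__reactProps' ]
console.log(internals.__reactProps.onClick()); // seek forward
```

In a DevTools console, the same function could be given a real element, for example one returned by document.querySelector.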

With those functions in hand, I made a Netflix player class in Smalltalk, which can manipulate the Netflix player React app interactively from Smalltalk code. Other objects I made representing show elements (like scenes, episodes, seasons, and series) can use my player to compile analytic information about shows, and present them in different ways. For example, you could watch an episode of Better Call Saul consisting only of scenes that include a certain character, or that take place at a certain location, or with flashbacks placed in chronological order. This is for a webapp I’m writing called Arc.

I’m eager to see what else you explore using the Caffeine extension in the DevTools!