
Context-Aware AI Conversations in Smalltalk

Posted in Appsterdam, Caffeine, consulting, Context, livecoding, Smalltalk, SqueakJS on 3 December 2024 by Craig Latta
There’s a lot of Smalltalk knowledge in the pre-training data of most LLMs.

I’ve been stumbling toward a “good enough” understanding of Smalltalk by an AI large language model, and toward Smalltalk tools for integrating conversations into the workflow. So far, I’ve been doing this by tuning English system prompts, without resorting to code at all. I’ve been impressed with the results. The pre-training knowledge the OpenAI gpt-4o model has of Smalltalk and Squeak seems to be a decent basis to build on. I evolve the prompts in response to chat completion quality, usually by adding constraints like “When writing code, don’t send a message to access an object when you can access it directly with an instance variable.”
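For example, a system prompt for conversing from a Squeak tool might read something like this (a minimal sketch; the wording and the systemPrompt selector are illustrative, not the actual prompt):

    systemPrompt
        "Answer the English prompt that orients conversations held from this
         tool. The wording here is an illustrative sketch only."
        ^'You are assisting a Smalltalk programmer working in a Squeak image.
    Answer questions concisely, and answer runnable Squeak code when asked.
    When writing code, don''t send a message to access an object when you can
    access it directly with an instance variable.'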

I wanted to converse with the language model from any text pane in Squeak, via the classic “do it”, “print it”, and “inspect it” we’re used to using with Smalltalk code. I changed Compiler>>evaluateCue:ifFail: to handle UndeclaredVariable exceptions, by delegating to the model object underlying the text pane in use. (It’s usually an UndeclaredVariable exception that happens first when one attempts to evaluate an English phrase. For example, “What” in “What went wrong?” is unbound.) That model object, in turn, handles the exception by interpreting the next chat completion from the language model.
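Such a hook might look roughly like this (a minimal sketch; the wrapped original method, the exact exception class name, the cue accessors, and the completeChat: selector are all assumed):

    evaluateCue: aCue ifFail: failBlock
        "Evaluate Smalltalk source as usual. If the source turns out to be
         English, an undeclared-variable exception is raised; hand the raw
         source to the model behind the requesting text pane instead."
        ^[self basicEvaluateCue: aCue ifFail: failBlock]    "assumed name for the original behavior"
            on: UndeclaredVariableWarning    "exact class name varies between Squeak versions"
            do: [:exception |
                exception return:
                    (aCue requestor model completeChat: aCue source asString)]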

The model objects I’ve focused on so far are instances of Debugger and Inspector. One cute thing about this approach is that it records do-its for English prompts just like it does for Smalltalk code, in the changes log. Each model can supply its own system prompts to orient conversations, and can interpret chat completions in a variety of ways (like running Smalltalk code written by the language model). Each model object also keeps a reference to its most recent chat completion, so that successive prompts are submitted to the language model in the context of the complete conversation so far.
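The conversation bookkeeping on such a model object might look roughly like this (again a sketch, with assumed selectors and instance variables):

    completeChat: promptString
        "Send the system prompt, the conversation so far, and the new prompt
         to the language model, then remember both sides of the exchange."
        | messages reply |
        conversation ifNil: [conversation := OrderedCollection new].
        messages := OrderedCollection new.
        messages add: (Dictionary new
            at: 'role' put: 'system';
            at: 'content' put: self systemPrompt;
            yourself).
        messages addAll: conversation.
        messages add: (Dictionary new
            at: 'role' put: 'user';
            at: 'content' put: promptString;
            yourself).
        reply := self requestCompletionFor: messages.    "hypothetical HTTP call to the chat-completions API"
        conversation
            add: messages last;
            add: (Dictionary new
                at: 'role' put: 'assistant';
                at: 'content' put: reply;
                yourself).
        lastCompletion := reply.
        ^self interpretCompletion: reply    "e.g., display it, or compile and run it"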

With all this in place, evaluating “What went wrong?” in a debugger text pane gives surprisingly correct, detailed, and useful answers. Running the code the model answers for “Write code for selecting the most recent context with a BlockClosure receiver.” manipulates the debugger correctly.
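An answer to that second prompt might look something like this (an illustrative sketch rather than actual model output; the Debugger selectors shown are assumptions):

    | index |
    "Find the topmost (most recent) context whose receiver is a BlockClosure
     and select it in the debugger's context list."
    index := self contextStack
        findFirst: [:each | each receiver isKindOf: BlockClosure].
    index > 0 ifTrue: [self toggleContextStackIndex: index].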

Next, I’m experimenting with prompts for describing an application’s domain, purpose, and user interface. I’m eager to see where this leads. :)