LLMs à la Carte
I recently received an email from a talented scholar asking me how Botpress interacts with LLMs.
He was writing an article about how to avoid vendor lock-in and wanted to know whether we used a framework like LangChain or Haystack.
I was happy to share with him that we have created our own abstractions that allow Botpress builders to interact with LLMs.
Given the interest this topic generates, I wanted to make this information public. It may be useful to other developers or users of our platform, and I hope you find it as interesting to read as it was for me to write.
Two ways to interact with LLMs in Botpress
Botpress has created its own abstractions, which work in two ways:
1. Integrations
Integrations are built around the concept of actions, each with specific input and output types.
We have open-sourced components of the platform so that the community can create their own integrations, which can be private or public.
Thus, each LLM provider (OpenAI, Anthropic, Groq, etc.) has its own integration. That's one way our users can interact with them.
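As a rough sketch (the property names below are assumptions for illustration, not the exact Botpress SDK surface), an integration exposing an LLM action with typed input and output might look like this:

```typescript
// Hypothetical sketch of an integration action with typed input/output.
// Property names are illustrative assumptions, not the exact SDK surface.
import { IntegrationDefinition, z } from '@botpress/sdk'

export default new IntegrationDefinition({
  name: 'my-llm-provider',
  version: '0.1.0',
  actions: {
    generateText: {
      input: {
        schema: z.object({
          prompt: z.string(),
          temperature: z.number().optional(),
        }),
      },
      output: {
        schema: z.object({
          text: z.string(),
          tokensUsed: z.number(),
        }),
      },
    },
  },
})
```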
2. LLM integration interfaces
To the concept of integrations we add that of "interfaces", which integrations can extend. We have created a standard schema for LLMs.
As long as an integration extends this schema, it is considered an LLM provider, so it works out of the box in Botpress.

Here are some examples of Botpress integrations for different LLM providers:
Anthropic
OpenAI
Groq
We have similar interfaces for text2image, image2text, voice2text, text2voice, etc.
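To make the idea concrete, here is a hypothetical sketch of what a shared LLM interface schema could look like. The field names are illustrative assumptions, not the actual interface definition:

```typescript
// Hypothetical sketch of a shared LLM interface schema. Any integration
// exposing an action of this shape could be treated as an LLM provider.
// Field names are assumptions for illustration only.
import { z } from 'zod'

export const generateContentInput = z.object({
  model: z.string(),
  messages: z.array(
    z.object({
      role: z.enum(['system', 'user', 'assistant']),
      content: z.string(),
    })
  ),
  temperature: z.number().min(0).max(2).optional(),
})

export const generateContentOutput = z.object({
  choices: z.array(z.object({ content: z.string() })),
  usage: z.object({ inputTokens: z.number(), outputTokens: z.number() }),
})
```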
Model configurations
In Botpress Studio we have two general settings: the "best model" and the "fast model". We have found that most tasks fit easily into one of these two modes.
[Screenshot: choosing between the "fast" and "best" LLM defaults in Botpress Studio.]
[Screenshot: model strategy selection in Botpress, a choice between fastest, hybrid, and best.]
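Conceptually, these two settings act as aliases that resolve to concrete models. The mapping below is an assumption for illustration, not Botpress's actual configuration:

```typescript
// Hypothetical mapping of the "best" / "fast" aliases to concrete models.
// The specific model IDs here are assumptions for illustration only.
type ModelSpeed = 'best' | 'fast'

const modelAliases: Record<ModelSpeed, string> = {
  best: 'anthropic:claude-sonnet', // highest quality for complex tasks
  fast: 'openai:gpt-4o-mini', // low latency and cost for simple tasks
}

function resolveModel(speed: ModelSpeed): string {
  return modelAliases[speed]
}
```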
But beyond model selection, we found that vendors diverge too much in tool calling and message formats to simply swap one model for another and expect good results.
The Botpress inference engine
We therefore created our own inference engine, called LLMz, which works with any model with no (or very minimal) changes to the instructions. It also provides better tool calling and often much better performance in terms of token cost and round trips to the LLM.
Behind the scenes, this engine uses TypeScript types for tool definitions, Markdown for message and code output formatting, and a native LLM execution sandbox for inference.
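For illustration, a tool described with TypeScript types might look something like this sketch; the Tool shape below is an assumption, not the real LLMz API:

```typescript
// Hypothetical sketch: a tool described with TypeScript types, so the
// engine can show the model a typed signature instead of a JSON schema.
// The Tool shape is an illustrative assumption, not the actual LLMz API.
interface Tool<I, O> {
  name: string
  description: string
  handler: (input: I) => Promise<O>
}

const getWeather: Tool<{ city: string }, { tempC: number; summary: string }> = {
  name: 'getWeather',
  description: 'Returns the current weather for a city',
  handler: async ({ city }) => {
    // A real implementation would call a weather API here.
    return { tempC: 21, summary: `Sunny in ${city}` }
  },
}
```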
LLMz provides many of the optimizations and debugging features needed for advanced use cases, such as the following (a small sketch of the wrapping and tracing idea follows the list):
input token compression
smart token truncation
token-optimized context memory
parallel and composite tool calls
mixing multiple messages + tool calls in a single LLM call
fully typesafe tools (input and output)
long-running sessions via sandbox serialization
tool mocking, wrapping, and tracing
full execution isolation in lightweight V8 isolates (allowing thousands of concurrent executions to run quickly and very cheaply)
automatic iterations and error recovery
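As a small sketch of the tool wrapping and tracing idea (the helper below is hypothetical, not LLMz code):

```typescript
// Hypothetical helper that wraps a tool handler to trace its calls.
// An illustration of the wrapping/tracing idea, not actual LLMz code.
type Handler<I, O> = (input: I) => Promise<O>

function withTracing<I, O>(name: string, handler: Handler<I, O>): Handler<I, O> {
  return async (input: I) => {
    const start = Date.now()
    console.log(`[tool:${name}] input`, input)
    const output = await handler(input)
    console.log(`[tool:${name}] output after ${Date.now() - start}ms`, output)
    return output
  }
}

// Usage: wrap the hypothetical getWeather handler from the earlier sketch.
// const tracedGetWeather = withTracing('getWeather', getWeather.handler)
```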
All of these were necessary for our use cases, but they were impossible or very difficult to achieve with the usual tools.