LLMs
The LLMClient is the primary interface for interacting with Large Language Models (LLMs) in the Helios Engine. It provides a unified API for both remote LLMs (like OpenAI) and local LLMs (via llama.cpp).
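Because both backends are exposed through the same client type, application code can stay provider-agnostic. As a minimal sketch (the ask helper below is hypothetical, not part of the engine's API), assuming the chat method shown later on this page borrows the client and that the response exposes its text as an owned content field:

use helios_engine::{llm::LLMClient, ChatMessage};

// Hypothetical helper: the body is identical whether `client` was created
// with a Remote or a Local provider, because both sit behind LLMClient.
async fn ask(client: &LLMClient, prompt: &str) -> helios_engine::Result<String> {
    let messages = vec![ChatMessage::user(prompt)];
    let response = client.chat(messages, None, None, None, None).await?;
    Ok(response.content)
}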
The LLMClient
The LLMClient is responsible for sending requests to the LLM and receiving responses. It can be created with either a remote or a local provider by passing LLMProviderType::Remote or LLMProviderType::Local to LLMClient::new.
Creating an LLMClient
Here's how to create an LLMClient with a remote provider:
use helios_engine::{llm::{LLMClient, LLMProviderType}, config::LLMConfig};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let llm_config = LLMConfig {
        model_name: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: std::env::var("OPENAI_API_KEY").unwrap(),
        temperature: 0.7,
        max_tokens: 2048,
    };

    let client = LLMClient::new(LLMProviderType::Remote(llm_config)).await?;

    Ok(())
}
And here's how to create an LLMClient with a local provider:
#[cfg(feature = "local")] use helios_engine::{llm::{LLMClient, LLMProviderType}, config::LocalConfig}; #[tokio::main] async fn main() -> helios_engine::Result<()> { let local_config = LocalConfig { huggingface_repo: "unsloth/Qwen3-0.6B-GGUF".to_string(), model_file: "Qwen3-0.6B-Q4_K_M.gguf".to_string(), temperature: 0.7, max_tokens: 2048, }; let client = LLMClient::new(LLMProviderType::Local(local_config)).await?; Ok(()) }
Note: To use the local provider, you must build Helios Engine with the local feature enabled, for example by adding features = ["local"] to the dependency entry in your Cargo.toml.
Sending Requests
Once you have an LLMClient, you can send requests to the LLM using the chat method.
Simple Chat
Here's a simple example of how to send a chat request:
use helios_engine::{llm::{LLMClient, LLMProviderType}, config::LLMConfig, ChatMessage};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let llm_config = LLMConfig {
        model_name: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: std::env::var("OPENAI_API_KEY").unwrap(),
        temperature: 0.7,
        max_tokens: 2048,
    };
    let client = LLMClient::new(LLMProviderType::Remote(llm_config)).await?;

    let messages = vec![ChatMessage::user("Hello, world!")];
    let response = client.chat(messages, None, None, None, None).await?;
    println!("Assistant: {}", response.content);

    Ok(())
}
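To continue the conversation over multiple turns, resend the accumulated history along with each new user message. The sketch below assumes a ChatMessage::assistant constructor mirroring ChatMessage::user, and that the response's content field is an owned String; if the crate builds assistant messages differently, adjust that line accordingly.

use helios_engine::{llm::{LLMClient, LLMProviderType}, config::LLMConfig, ChatMessage};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let llm_config = LLMConfig {
        model_name: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: std::env::var("OPENAI_API_KEY").unwrap(),
        temperature: 0.7,
        max_tokens: 2048,
    };
    let client = LLMClient::new(LLMProviderType::Remote(llm_config)).await?;

    // First turn.
    let first = client
        .chat(vec![ChatMessage::user("Hello, world!")], None, None, None, None)
        .await?;
    println!("Assistant: {}", first.content);

    // Second turn: resend the history plus the assistant's reply and a new question.
    // Assumes a ChatMessage::assistant constructor mirroring ChatMessage::user.
    let history = vec![
        ChatMessage::user("Hello, world!"),
        ChatMessage::assistant(first.content.as_str()),
        ChatMessage::user("Can you say that in French?"),
    ];
    let second = client.chat(history, None, None, None, None).await?;
    println!("Assistant: {}", second.content);

    Ok(())
}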
Streaming Responses
The LLMClient also supports streaming, delivering the response in chunks to a callback as they arrive. Here's an example of how to use the chat_stream method:
use std::io::Write;

use helios_engine::{llm::{LLMClient, LLMProviderType}, config::LLMConfig, ChatMessage};

#[tokio::main]
async fn main() -> helios_engine::Result<()> {
    let llm_config = LLMConfig {
        model_name: "gpt-3.5-turbo".to_string(),
        base_url: "https://api.openai.com/v1".to_string(),
        api_key: std::env::var("OPENAI_API_KEY").unwrap(),
        temperature: 0.7,
        max_tokens: 2048,
    };
    let client = LLMClient::new(LLMProviderType::Remote(llm_config)).await?;

    let messages = vec![ChatMessage::user("Hello, world!")];
    let response = client
        .chat_stream(messages, None, None, None, None, |chunk| {
            // Print each chunk as it arrives; flush so partial lines appear immediately.
            print!("{}", chunk);
            let _ = std::io::stdout().flush();
        })
        .await?;
    println!();

    Ok(())
}
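The callback can also accumulate the chunks for later use while echoing them. Continuing from the example above (reusing its client and imports), and assuming chat_stream accepts a short-lived, mutably capturing (FnMut) closure; this is a sketch, not a guaranteed part of the API:

    // Collect the streamed chunks into a String while echoing them to stdout.
    let mut full_text = String::new();
    client
        .chat_stream(vec![ChatMessage::user("Hello, world!")], None, None, None, None, |chunk| {
            print!("{}", chunk);
            let _ = std::io::stdout().flush();
            full_text.push_str(&chunk);
        })
        .await?;
    println!();
    println!("Received {} characters in total.", full_text.len());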