Together Backend LLM Implementation
This section details the Together implementation of the BaseBackendLLM
interface.
memora.llm_backends.TogetherBackendLLM
TogetherBackendLLM(
api_key: str,
model: str = "meta-llama/Llama-3.3-70B-Instruct-Turbo",
temperature: float = 0.7,
top_p: float = 1,
max_tokens: int = 1024,
max_retries: int = 3,
)
Bases: BaseBackendLLM
PARAMETER | TYPE | DESCRIPTION | DEFAULT
---|---|---|---
`api_key` | `str` | The API key to use for authentication | *required*
`model` | `str` | The name of the Together model to use | `"meta-llama/Llama-3.3-70B-Instruct-Turbo"`
`temperature` | `float` | The temperature to use for sampling | `0.7`
`top_p` | `float` | The top_p value to use for sampling | `1`
`max_tokens` | `int` | The maximum number of tokens to generate | `1024`
`max_retries` | `int` | The maximum number of retries to make if a request fails | `3`
Example
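A minimal construction sketch (reading the key from an environment variable and overriding a couple of defaults is illustrative, not required):

```python
import os

from memora.llm_backends import TogetherBackendLLM

# Construct the backend; omitted parameters fall back to the defaults
# shown in the signature above.
llm = TogetherBackendLLM(
    api_key=os.environ["TOGETHER_API_KEY"],  # illustrative: key taken from the environment
    temperature=0.7,
    max_tokens=1024,
)
```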
Source code in memora/llm_backends/together_backend_llm.py
Attributes

get_model_kwargs (property)
Returns a dictionary of model configuration parameters (a usage sketch follows below).

together_client (instance-attribute)
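For illustration, a short sketch of inspecting the configuration through the property; the exact keys in the returned dictionary are implementation-defined:

```python
import os

from memora.llm_backends import TogetherBackendLLM

llm = TogetherBackendLLM(api_key=os.environ["TOGETHER_API_KEY"], temperature=0.2)

# Dictionary of model configuration parameters (e.g. sampling settings)
# that the backend passes along to the Together client; exact keys may vary.
print(llm.get_model_kwargs)
```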
Functions
__call__ (async)
__call__(
messages: List[Dict[str, str]],
output_schema_model: Type[BaseModel] | None = None,
) -> Union[str, BaseModel]
Process messages and generate a response (📌 Streaming is not supported, as the full response is required at once)
PARAMETER | TYPE | DESCRIPTION | DEFAULT
---|---|---|---
`messages` | `List[Dict[str, str]]` | List of message dicts with role and content, e.g. `[{"role": "user", "content": "Hello!"}, ...]` | *required*
`output_schema_model` | `Type[BaseModel] \| None` | Optional Pydantic base model for structured output (📌 Ensure the chosen model supports this) | `None`
RETURNS | DESCRIPTION
---|---
`Union[str, BaseModel]` | Generated text response as a string, or an instance of the output schema model if specified
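A hedged usage sketch covering both plain-text and structured output; `ReplySchema` and the prompts are illustrative, and the chosen model must support structured output for the second call:

```python
import asyncio
import os

from pydantic import BaseModel

from memora.llm_backends import TogetherBackendLLM


class ReplySchema(BaseModel):
    # Illustrative schema; define whatever fields your application needs.
    answer: str
    confidence: float


async def main() -> None:
    llm = TogetherBackendLLM(api_key=os.environ["TOGETHER_API_KEY"])

    # Plain text: returns a str.
    text = await llm([{"role": "user", "content": "Hello!"}])
    print(text)

    # Structured output: returns a ReplySchema instance
    # (ensure the chosen model supports structured output).
    reply = await llm(
        [{"role": "user", "content": "Is water wet? Reply with an answer and a confidence."}],
        output_schema_model=ReplySchema,
    )
    print(reply.answer, reply.confidence)


asyncio.run(main())
```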