Azure OpenAI Backend LLM Implementation

This section details the Azure OpenAI implementation of the BaseBackendLLM interface.

memora.llm_backends.AzureOpenAIBackendLLM

AzureOpenAIBackendLLM(
    azure_openai_client: AsyncAzureOpenAI = None,
    model: str = "gpt-4o",
    temperature: float = 0.7,
    top_p: float = 1,
    max_tokens: int = 1024,
)

Bases: BaseBackendLLM

PARAMETER DESCRIPTION
azure_openai_client

A pre-initialized Async Azure OpenAI client

TYPE: AsyncAzureOpenAI DEFAULT: None

model

The name of the Azure OpenAI model to use

TYPE: str DEFAULT: 'gpt-4o'

temperature

The temperature to use for sampling

TYPE: float DEFAULT: 0.7

top_p

The top_p value to use for sampling

TYPE: float DEFAULT: 1

max_tokens

The maximum number of tokens to generate

TYPE: int DEFAULT: 1024

Example
from openai import AsyncAzureOpenAI
from memora.llm_backends import AzureOpenAIBackendLLM

azure_openai_backend_llm = AzureOpenAIBackendLLM(
    azure_openai_client=AsyncAzureOpenAI(
        azure_endpoint="AZURE_OPENAI_ENDPOINT",
        api_key="AZURE_OPENAI_API_KEY",
        api_version="API_VERSION", # e.g "2024-08-01-preview" or later
        max_retries=3
        )
    )
Source code in memora/llm_backends/azure_openai_backend_llm.py
def __init__(
    self,
    azure_openai_client: AsyncAzureOpenAI = None,
    model: str = "gpt-4o",
    temperature: float = 0.7,
    top_p: float = 1,
    max_tokens: int = 1024,
):
    """
    Initialize the AzureOpenAIBackendLLM class with the Azure OpenAI client and specific parameters.

    Args:
        azure_openai_client (AsyncAzureOpenAI): A pre-initialized Async Azure OpenAI client
        model (str): The name of the Azure OpenAI model to use
        temperature (float): The temperature to use for sampling
        top_p (float): The top_p value to use for sampling
        max_tokens (int): The maximum number of tokens to generate

    Example:
        ```python
        from openai import AsyncAzureOpenAI
        from memora.llm_backends import AzureOpenAIBackendLLM

        azure_openai_backend_llm = AzureOpenAIBackendLLM(
            azure_openai_client=AsyncAzureOpenAI(
                azure_endpoint="AZURE_OPENAI_ENDPOINT",
                api_key="AZURE_OPENAI_API_KEY",
                api_version="API_VERSION", # e.g "2024-08-01-preview" or later
                max_retries=3
                )
            )
        ```
    """

    self.azure_client = azure_openai_client
    self.model = model
    self.temperature = temperature
    self.top_p = top_p
    self.max_tokens = max_tokens

Attributes

azure_client instance-attribute

azure_client = azure_openai_client

get_model_kwargs property

get_model_kwargs: Dict[str, Any]

Returns a dictionary of model configuration parameters
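
For illustration, this dictionary is unpacked directly into the chat completion calls in __call__, so it presumably holds the sampling parameters set at construction time (the exact keys below are an assumption based on that usage, not taken from the source shown on this page):

model_kwargs = azure_openai_backend_llm.get_model_kwargs
# Assumed contents, e.g.:
# {"model": "gpt-4o", "temperature": 0.7, "top_p": 1, "max_tokens": 1024}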

max_tokens instance-attribute

max_tokens = max_tokens

model instance-attribute

model = model

temperature instance-attribute

temperature = temperature

top_p instance-attribute

top_p = top_p

Functions

__call__ async

__call__(
    messages: List[Dict[str, str]],
    output_schema_model: Type[BaseModel] | None = None,
) -> Union[str, BaseModel]

Process messages and generate a response (📌 Streaming is not supported, as the full response is required at once)

PARAMETER DESCRIPTION
messages

List of message dicts with role and content, e.g. [{"role": "user", "content": "Hello!"}, ...]

TYPE: List[Dict[str, str]]

output_schema_model

Optional Pydantic model for structured output (📌 Ensure the API version and selected model support this.)

TYPE: Type[BaseModel] | None DEFAULT: None

RETURNS DESCRIPTION
Union[str, BaseModel]

Union[str, BaseModel]: Generated text response as a string, or an instance of the output schema model if specified
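
A minimal usage sketch, reusing the azure_openai_backend_llm instance from the Example above; the message contents and the ExtractedFact schema are illustrative assumptions, only the __call__ signature comes from this page:

import asyncio
from pydantic import BaseModel

class ExtractedFact(BaseModel):  # hypothetical schema for structured output
    fact: str
    confidence: float

async def main():
    # Plain text generation: returns a string
    reply = await azure_openai_backend_llm(
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(reply)

    # Structured output: returns an ExtractedFact instance
    # (📌 requires an API version and model that support structured outputs)
    fact = await azure_openai_backend_llm(
        messages=[{"role": "user", "content": "The sky is blue."}],
        output_schema_model=ExtractedFact,
    )
    print(fact.fact, fact.confidence)

asyncio.run(main())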

Source code in memora/llm_backends/azure_openai_backend_llm.py
@override
async def __call__(
    self,
    messages: List[Dict[str, str]],
    output_schema_model: Type[BaseModel] | None = None,
) -> Union[str, BaseModel]:
    """
    Process messages and generate a response (📌 Streaming is not supported, as the full response is required at once)

    Args:
        messages (List[Dict[str, str]]): List of message dicts with role and content, e.g. [{"role": "user", "content": "Hello!"}, ...]
        output_schema_model (Type[BaseModel] | None): Optional Pydantic model for structured output (📌 Ensure the API version and selected model support this.)

    Returns:
        Union[str, BaseModel]: Generated text response as a string, or an instance of the output schema model if specified
    """

    if output_schema_model:
        response = await self.azure_client.beta.chat.completions.parse(
            messages=messages,
            **self.get_model_kwargs,
            response_format=output_schema_model,
        )
        return response.choices[0].message.parsed
    else:
        response = await self.azure_client.chat.completions.create(
            messages=messages,
            **self.get_model_kwargs,
        )
        return response.choices[0].message.content

close async

close() -> None

Closes the LLM connection.

Source code in memora/llm_backends/azure_openai_backend_llm.py
@override
async def close(self) -> None:
    """Closes the LLM connection."""

    await self.azure_client.close()
    self.azure_client = None
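
A brief cleanup sketch (variable name taken from the Example above): once the backend is no longer needed, close it so the underlying AsyncAzureOpenAI client releases its connections.

await azure_openai_backend_llm.close()
# After this, azure_client is None and the backend can no longer be called.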