Simple Sunflower inference endpoint for a single instruction and response.

A simplified interface for clients that send one instruction instead of
managing conversation history. Input is form-based, which makes integration
with simple clients easier.

Args:
    request: The FastAPI request object (required for rate limiting).
    background_tasks: FastAPI background tasks for async operations.
    instruction: The question or instruction for the AI.
    model_type: Either 'qwen' (default) or 'gemma'.
    temperature: Controls randomness (0.0 = deterministic, 1.0 = creative).
    system_message: Optional custom system message.
    db: Database session.
    current_user: The authenticated user.

Returns:
    Dictionary containing response, model_type, processing_time, usage,
    and success.

Raises:
    BadRequestError: For validation errors.
    ValidationError: For an invalid model type.
    ServiceUnavailableError: If the model is loading or the request times out.
    ExternalServiceError: For unexpected errors.

Example:
    Form data:
        - instruction: "Translate 'hello' to Luganda"
        - model_type: "qwen"
        - temperature: 0.3

    Response:
        {
            "response": "In Luganda, 'hello' is 'Gyebaleko' or 'Wasuze otya' (Good morning).",
            "model_type": "qwen",
            "processing_time": 1.5,
            "usage": {"completion_tokens": 20, "prompt_tokens": 10, "total_tokens": 30},
            "success": true
        }
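
A minimal client-side sketch of how the form fields above might be assembled and how the response dictionary might be read. It uses only the standard library and builds the request without sending it. The URL, the bearer-token header, the 0.3 temperature default, and the helper names `build_simple_request` and `parse_simple_response` are assumptions made for illustration; none of them are defined by this endpoint's documentation.

```python
import json
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical URL: the real route path is not stated in this docstring.
ENDPOINT_URL = "https://api.example.com/sunflower_simple"


def build_simple_request(
    instruction: str,
    model_type: str = "qwen",
    temperature: float = 0.3,  # assumed default, taken from the example above
    system_message: Optional[str] = None,
    token: str = "YOUR_TOKEN",
) -> Request:
    """Build (but do not send) a form-encoded POST for the endpoint."""
    # Mirror the documented constraints locally to fail fast on bad input.
    if not instruction.strip():
        raise ValueError("instruction must not be empty")
    if model_type not in ("qwen", "gemma"):
        raise ValueError("model_type must be 'qwen' or 'gemma'")

    fields = {
        "instruction": instruction,
        "model_type": model_type,
        "temperature": str(temperature),
    }
    # system_message is optional, so only send it when provided.
    if system_message is not None:
        fields["system_message"] = system_message

    return Request(
        ENDPOINT_URL,
        data=urlencode(fields).encode("utf-8"),
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )


def parse_simple_response(body: str) -> str:
    """Return the generated text, or raise if the call did not succeed."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError("inference request did not succeed")
    return payload["response"]


request = build_simple_request("Translate 'hello' to Luganda")
# To actually send it: urllib.request.urlopen(request, timeout=60)
```

Validating `model_type` on the client is optional, since the server is documented to reject invalid values itself; it only saves a round trip.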