Back to Benchmarks

functiongemma-270m:latest (ollama)

Tiny (270M Function)268.10MFunction-callingQ4_K_MThinking: high2026-05-13
0% 0/2 tasks passed
Model class: Tiny (270M Function) Parameters: 268.10M Architecture: Function-calling Quantization: Q4_K_M Thinking: high Backend: ollama GPU: GPU (via Ollama host) CPU: AMD Ryzen 9 5900X 12-Core Processor RAM: 121GB Speed: 20.1 tok/s output-est Total time: 9.1m Failure modes: None

Task Results

Click any task row to expand the full prompt, conversation transcript, and judge evaluation.

TaskCategoryScoreSpeedTimeFailure
Easy JSON Fact Extraction structured_output 0/5 20.1 tok/s output-est 23.6s
Easy Single-Step Tool Intent tool_intent 0/5 N/A 23.6s

Methodology

Each task is scored by an LLM judge against the task rubric after the run is inspected for harness errors. A task counts as a pass when it scores at least 60%. Speed is measured in tokens per second when available. Hardware is auto-detected including WSL2 GPU detection.