O narzędziu
Llamafile (Mozilla-AI) — distribute i run LLMs jako single self-contained executable. v0.10.0 (III 2026) — significant architectural overhaul: Whisper (speech recognition), multimodal models, tool calling, Anthropic Messages API support. Hybrid TUI chat/server mode, CLI modality, --image argument. Metal GPU (macOS) out-of-the-box, restored CUDA. Cosmopolitan llama.cpp build — portability + bundled weights.
Zastosowanie
- •Single-file LLM distribution (offline + air-gapped).
- •Multimodal apps (text + image + speech).
- •Whisper transcription lokalnie.
- •Tool calling dla agents.
- •Cross-platform deployment bez Docker.
Nowe funkcje
- •Hybrid TUI chat/server mode.
- •CLI modality dla one-shot questions.
- •Improved logging i argument handling.
- •--image argument dla images.
- •Metal GPU (macOS) out-of-the-box.
- •Restored NVIDIA CUDA support.
Funkcje dodatkowe
▶Single Executable
Pojedynczy plik wykonywalny z bundle weights — no install, no Docker, no Python environment. Pobierasz plik, uruchamiasz, dziala. Najmnejsza bariera wejscia do local LLM w branzy.
▶Cosmopolitan llama.cpp Build
Built on top of Cosmopolitan libc — pojedynczy executable dziala native na Linux, macOS, Windows, FreeBSD i OpenBSD. Cross-platform portability bez kompilacji per OS.
▶Whisper (Speech Recognition)
Wbudowany Whisper dla speech-to-text — multimodal beyond text. Pozwala na voice-based interaction z lokalnym LLM bez external transcription service.
▶Multimodal Models Support
Wsparcie dla multimodal models (vision, audio + text) w v0.10.0 (III 2026). --image argument dla images, audio przez Whisper integration. Full multimodal w single file.
▶Tool Calling
Tool calling support dla agents — modele moga wywolywac external tools. Polaczone z single-file deployment, daje agent capabilities bez complex setup.
▶Anthropic Messages API Support
Nowe w v0.10.0 — Anthropic Messages API compatibility. Pozwala uzywac Llamafile z istniejacymi Anthropic SDK i tools (jak Claude Code).
▶Hybrid TUI Chat/Server Mode
TUI (text-based) chat mode w terminalu — quick interaction bez GUI. Plus server mode dla API access. CLI modality dla one-shot questions w shell scripts.
▶--image Argument
Argument --image dla podawania obrazow do multimodal models z CLI. Pozwala na quick OCR, image description, visual question answering bez koniecznosci budowania API.
▶Metal GPU + CUDA Support
Out-of-the-box Metal GPU support (macOS) i restored NVIDIA CUDA support. Auto-detect best backend dla danego hardware bez manual configuration.
▶Cross-platform/Cross-architecture
Linux, macOS, Windows; x86 i ARM. Mozesz przeniesc executable z desktopa na server, z Intel na Apple Silicon bez rekompilacji. Mozilla project, Apache 2.0 license.
✓ Zalety
Cennik
- •Open-source (Apache 2.0).
- •$0.
- •GitHub: mozilla-ai/llamafile.
API i integracje
- •Single executable file (no install).
- •Cross-platform (Linux, macOS, Windows).
- •Cross-architecture (x86, ARM).
- •OpenAI API compatible (Anthropic added).
- •Bundle weights w executable.
v0.10.0 (III 2026)
- •Significant architectural overhaul (rebuilt from scratch).
- •Cosmopolitan llama.cpp build — portability + bundled weights.
- •Whisper (speech recognition) — multimodal beyond text.
- •Multimodal models support.
- •Tool calling.
- •Anthropic Messages API support.
Aktualne ograniczenia (v0.10.0)
- •Stable diffusion code obecny ale not ported do new build.
- •Pledge() i SECCOMP sandboxing absent.
- •Llamafiler (embeddings) rolled back do llama.cpp built-in.
- •Niektóre CLI args from earlier versions not yet operational.
