O narzędziu
Together AI — AI Native Cloud dla 15+ open-source modeli. Inference API: Llama 3.3 8B $0.18/M, 70B $0.88/M, Mixtral 8x22B $1.20/M, Llama 4 Scout $0.18/$0.59/M. Fine-tuning LoRA: Llama 3.3 8B $4.50/M, 70B $14/M, Mixtral $16/M. Full FT na 8xH100: Llama 8B $12/h, 70B $22/h, Mixtral $28/h. GPU Cloud: H100 $3.49/h, H200 $4.19/h, B200 $7.49/h.
Funkcje dodatkowe
▶Inference API (15+ open-source models)
Pay-per-token API do 15+ open-source modeli — Llama, DeepSeek, Qwen, Mistral, Mixtral, Gemma 2. Najszerszy katalog open-source LLM w branzy inference API.
▶Llama Models
Llama 3.3 8B ($0.18/M), Llama 3.3 70B ($0.88/M), Llama 4 Scout ($0.18/$0.59/M input/output), Llama 4 Maverick ($0.27/$0.85/M). Day-0 support dla najnowszych Meta releases.
▶DeepSeek R1/V3 Support
DeepSeek R1 (reasoning model konkurujacy z o1) i DeepSeek V3 dostepne natychmiast po release. Chinski lider open-source LLM z performance comparable do frontier models.
▶Fine-tuning (LoRA + Full FT)
LoRA fine-tuning od $4.50/M tokens (Llama 8B) do $16/M (Mixtral 8x22B). Full fine-tuning na 8xH100: $12-$28/h. Pelen lifecycle od inference do fine-tuningu na jednej platformie.
▶GPU Cloud (H100/H200/B200)
NVIDIA HGX H100 ($3.49/h), HGX H200 ($4.19/h), HGX B200 ($7.49/h, Blackwell). Pay-as-you-go lub reserved capacity dla duration >6 dni z dyskontami volume.
▶Mixtral / Qwen / Mistral / Gemma
Pelne wsparcie dla rodzin Mixtral 8x22B, Qwen 2.5, Mistral, Gemma 2. Razem z Llama i DeepSeek pokrywa wiekszosc enterprise use cases bez vendor lock-in u providerow propriety.
▶Pay-as-you-go + Reserved
Pay-as-you-go dla startupow i sporadycznego uzycia. Reserved capacity dla durations powyzej 6 dni — dramatically lower pricing dla stable production workloads.
▶NVIDIA Blackwell B200
Dostep do najnowszych NVIDIA B200 Blackwell GPU ($7.49/h) — 2x performance vs H100 dla LLM inference i training. Wczesny dostep do frontier hardware dla AI labs i power users.
▶Multi-GPU Training (8xH100)
Distributed training na 8xH100 nodes dla full fine-tuningu 70B+ modeli. Pre-configured stack: NCCL, distributed PyTorch, DeepSpeed — zero setup time.
▶Open-source Model Catalog
Largest curated katalog open-source LLM w branzy — 15+ starannie wybranych modeli z poszczegolnymi pricingiem i benchmarks. Single source of truth dla open-source AI.
✓ Zalety
Modele (15+ open-source)
- •Llama 3.3 8B/70B, Llama 4 Scout/Maverick.
- •DeepSeek R1, DeepSeek V3.
- •Qwen 2.5.
- •Mistral, Mixtral 8x22B.
- •Gemma 2.
Cennik
- •Inference: pay-per-token (od $0.18/M Llama 8B do $1.20/M Mixtral 8x22B).
- •Fine-tuning LoRA: $4.50-$16/M training tokens.
- •Full FT 8xH100: $12-$28/h.
- •GPU Cloud: H100 $3.49/h, H200 $4.19/h, B200 $7.49/h.
- •Reserved capacity dla durations >6 dni.
Inference API (per token)
- •Llama 3.3 8B: $0.18/M tokens.
- •Llama 3.3 70B: $0.88/M.
- •Mixtral 8x22B: $1.20/M.
- •Llama 4 Scout: $0.18/$0.59/M (input/output).
- •Llama 4 Maverick: $0.27/$0.85/M.
- •DeepSeek R1, V3, Qwen 2.5, Mistral, Mixtral, Gemma 2.
Fine-tuning
- •LoRA: Llama 3.3 8B $4.50/M training tokens, 70B $14/M, Mixtral 8x22B $16/M.
- •Full fine-tuning na 8xH100: Llama 8B $12/h, 70B $22/h, Mixtral 8x22B $28/h.
GPU Cloud (hourly)
- •NVIDIA HGX H100: $3.49/h.
- •NVIDIA HGX H200: $4.19/h.
- •NVIDIA HGX B200: $7.49/h (Blackwell).
- •Pay-as-you-go lub reserved (powyżej 6 dni).
