Strona głównaNarzędzia AIHosting Modeli AITogether AI
Together AI

Together AI

0(0)·Hosting Modeli AI
Pay-per-token + GPU hourlyOdwiedź stronę →

O narzędziu

Together AI — AI Native Cloud dla 15+ open-source modeli. Inference API: Llama 3.3 8B $0.18/M, 70B $0.88/M, Mixtral 8x22B $1.20/M, Llama 4 Scout $0.18/$0.59/M. Fine-tuning LoRA: Llama 3.3 8B $4.50/M, 70B $14/M, Mixtral $16/M. Full FT na 8xH100: Llama 8B $12/h, 70B $22/h, Mixtral $28/h. GPU Cloud: H100 $3.49/h, H200 $4.19/h, B200 $7.49/h.

Funkcje dodatkowe

Inference API (15+ open-source models)

Pay-per-token API do 15+ open-source modeli — Llama, DeepSeek, Qwen, Mistral, Mixtral, Gemma 2. Najszerszy katalog open-source LLM w branzy inference API.

Llama Models

Llama 3.3 8B ($0.18/M), Llama 3.3 70B ($0.88/M), Llama 4 Scout ($0.18/$0.59/M input/output), Llama 4 Maverick ($0.27/$0.85/M). Day-0 support dla najnowszych Meta releases.

DeepSeek R1/V3 Support

DeepSeek R1 (reasoning model konkurujacy z o1) i DeepSeek V3 dostepne natychmiast po release. Chinski lider open-source LLM z performance comparable do frontier models.

Fine-tuning (LoRA + Full FT)

LoRA fine-tuning od $4.50/M tokens (Llama 8B) do $16/M (Mixtral 8x22B). Full fine-tuning na 8xH100: $12-$28/h. Pelen lifecycle od inference do fine-tuningu na jednej platformie.

GPU Cloud (H100/H200/B200)

NVIDIA HGX H100 ($3.49/h), HGX H200 ($4.19/h), HGX B200 ($7.49/h, Blackwell). Pay-as-you-go lub reserved capacity dla duration >6 dni z dyskontami volume.

Mixtral / Qwen / Mistral / Gemma

Pelne wsparcie dla rodzin Mixtral 8x22B, Qwen 2.5, Mistral, Gemma 2. Razem z Llama i DeepSeek pokrywa wiekszosc enterprise use cases bez vendor lock-in u providerow propriety.

Pay-as-you-go + Reserved

Pay-as-you-go dla startupow i sporadycznego uzycia. Reserved capacity dla durations powyzej 6 dni — dramatically lower pricing dla stable production workloads.

NVIDIA Blackwell B200

Dostep do najnowszych NVIDIA B200 Blackwell GPU ($7.49/h) — 2x performance vs H100 dla LLM inference i training. Wczesny dostep do frontier hardware dla AI labs i power users.

Multi-GPU Training (8xH100)

Distributed training na 8xH100 nodes dla full fine-tuningu 70B+ modeli. Pre-configured stack: NCCL, distributed PyTorch, DeepSpeed — zero setup time.

Open-source Model Catalog

Largest curated katalog open-source LLM w branzy — 15+ starannie wybranych modeli z poszczegolnymi pricingiem i benchmarks. Single source of truth dla open-source AI.

✓ Zalety

+15+ open-source modeli przez 1 API
+Inference Llama 3.3 8B tylko $0.18/M
+Full FT 8xH100 od $12/h (najtaniej)
+B200 $7.49/h (Blackwell available)
+LoRA i Full Fine-tuning support
+Reserved capacity (>6 dni)
🧠

Modele (15+ open-source)

  • Llama 3.3 8B/70B, Llama 4 Scout/Maverick.
  • DeepSeek R1, DeepSeek V3.
  • Qwen 2.5.
  • Mistral, Mixtral 8x22B.
  • Gemma 2.
💰

Cennik

  • Inference: pay-per-token (od $0.18/M Llama 8B do $1.20/M Mixtral 8x22B).
  • Fine-tuning LoRA: $4.50-$16/M training tokens.
  • Full FT 8xH100: $12-$28/h.
  • GPU Cloud: H100 $3.49/h, H200 $4.19/h, B200 $7.49/h.
  • Reserved capacity dla durations >6 dni.
🔗

Inference API (per token)

  • Llama 3.3 8B: $0.18/M tokens.
  • Llama 3.3 70B: $0.88/M.
  • Mixtral 8x22B: $1.20/M.
  • Llama 4 Scout: $0.18/$0.59/M (input/output).
  • Llama 4 Maverick: $0.27/$0.85/M.
  • DeepSeek R1, V3, Qwen 2.5, Mistral, Mixtral, Gemma 2.
📋

Fine-tuning

  • LoRA: Llama 3.3 8B $4.50/M training tokens, 70B $14/M, Mixtral 8x22B $16/M.
  • Full fine-tuning na 8xH100: Llama 8B $12/h, 70B $22/h, Mixtral 8x22B $28/h.
📋

GPU Cloud (hourly)

  • NVIDIA HGX H100: $3.49/h.
  • NVIDIA HGX H200: $4.19/h.
  • NVIDIA HGX B200: $7.49/h (Blackwell).
  • Pay-as-you-go lub reserved (powyżej 6 dni).

Szczegóły

CenaPay-per-token + GPU hourly
KategoriaHosting Modeli AI
Inference APIFine-tuningH100 $3.49B200 $7.4915+ open-source models