← Back to LLM Wiki
LLM Wiki · Open Weights · Self-Hostable

Hermes

Nous Research's Hermes series — open-weights fine-tunes with best-in-class function calling, steerable alignment, and an agent-native training stance.
Hermes is the reference open-weights fine-tune for agent and tool-use workloads. Built on Llama, Qwen, and Mistral bases, the series is trained with heavy emphasis on instruction following, structured output, and long-form reasoning — with alignment tuned to respect the system prompt rather than override it. Hermes 2 Pro and Hermes 4 are widely deployed as drop-in replacements for closed-API function calling.
Nous Research Open Weights Function Calling Agentic Steerable

Quick Facts

Vendor
Nous Research (independent lab; Teknium, Karan, interstellarninja, others)
Released
OpenHermes / Hermes 2 (2023–2024); Hermes 3 (August 2024); Hermes 4 (2025)
Current line
Hermes 4 · Hermes 3 (405B, 70B, 8B) · Hermes 2 Pro · DeepHermes
Base models
Llama 3.x, Qwen 2.5, Mistral, Phi (variant-dependent)
License
Inherited from base — Llama Community License, Apache 2.0, or Qwen license per variant
Hosting
Self-hosted (vLLM, Ollama); hosted via Nous Chat, Hyperbolic, Together, Fireworks, OpenRouter
Context window
128K+ tokens via YaRN extensions
Training approach
SFT + DPO on curated agentic / function-calling / reasoning data

Summary

Hermes is the flagship fine-tune line from Nous Research, an independent research collective that has shipped more high-signal open-weights work than most well-funded labs. The design stance is "neutrally aligned" — Hermes respects the system prompt as the source of truth for persona and behavior rather than overriding it with safety layers baked deep into the weights. For agent infrastructure this is the point: if you need the model to adopt a role, use a tool, or produce structured output, Hermes is built for that.

Three technical contributions matter. (1) Function calling: Hermes 2 Pro introduced an open-source function-calling format with fine-tunes explicitly trained on it, and Hermes 4 continues that lineage with competitive tool-call reliability against frontier closed models. (2) YaRN context extension: the Nous team authored the YaRN paper and shipped extended-context variants that hold quality past 128K. (3) Agentic training data: DPO and SFT mixes include long-horizon reasoning traces, which materially improves multi-step tool loops over stock base-model fine-tunes.

Model Lineup

Where Hermes Fits

Hermes is the default choice when you need open-weights function calling at the quality tier agent frameworks assume. Pick Hermes over a stock base model when the workload involves: structured JSON output under strict schemas, multi-step tool loops with retries, or personas driven by the system prompt rather than fine-tuned in. The 8B / 14B variants run comfortably on Mac Mini edge hardware, making Hermes a credible self-hosted alternative to hosted function-calling APIs.

Tradeoffs

Deployment Notes

Within the Claw ecosystem, Hermes variants are a first-class alternative to stock Qwen3-Coder for agent loops where system-prompt fidelity matters more than raw coding ability — ops agents, research agents, long-running orchestrators. Hermes 3 8B and Hermes 4 14B run well on Mac Mini edge nodes via vLLM or Ollama. Function-calling-heavy workloads that hit reliability ceilings on stock base models are usually fixed by switching to Hermes Pro or Hermes 4 with no other code changes.

FrawdBot still sits in front as a moderation layer — given Hermes's steerable alignment, the policy boundary lives in infrastructure, not in the weights.

References

  1. Nous Research
  2. Nous Research on Hugging Face
  3. YaRN: Efficient Context Window Extension of Large Language Models
  4. Llama — LLM Wiki
  5. Qwen — LLM Wiki
  6. The Agent Infrastructure Stack — Organized AI