This safeguard model has 8B parameters and is based on the Llama 3 family. It can classify both prompts and responses. Llama Guard 2 behaves like an ordinary LLM: it generates text stating whether the given input or output is safe or unsafe and, when the content is unsafe, lists the content categories violated. For best results, use raw prompt input or the completions endpoint instead of the chat API. Use of this model is subject to Meta's Acceptable Use Policy.
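Because the verdict comes back as plain generated text, the caller has to parse it. The following is a minimal sketch, not an official client. It assumes the completion's first non-empty line is `safe` or `unsafe`, and that an `unsafe` verdict is followed by a line of comma-separated category codes such as `S1,S3`. The names `GuardVerdict` and `parse_guard_output` are hypothetical helpers introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class GuardVerdict:
    """Parsed classifier result (hypothetical helper type)."""
    safe: bool
    categories: list[str] = field(default_factory=list)


def parse_guard_output(text: str) -> GuardVerdict:
    """Parse the raw completion text returned by the classifier.

    Assumed format: first non-empty line is "safe" or "unsafe";
    for "unsafe", the next line lists category codes separated by commas.
    """
    # Drop blank lines and surrounding whitespace before inspecting the verdict.
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty classifier output")

    label = lines[0].lower()
    if label == "safe":
        return GuardVerdict(safe=True)
    if label == "unsafe":
        categories: list[str] = []
        if len(lines) > 1:
            categories = [c.strip() for c in lines[1].split(",") if c.strip()]
        return GuardVerdict(safe=False, categories=categories)

    # Anything else is treated as a malformed response rather than guessed at.
    raise ValueError(f"unexpected classifier output: {text!r}")
```

Calling `parse_guard_output` on the text field of a completions response yields a `GuardVerdict` whose `categories` list is empty for safe content. Raising on unrecognized output, rather than defaulting to safe, keeps a malformed response from silently passing a safety check.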
8,192 tokens
8,192 tokens
$0.20
$0.20
Safety classifier; not a chat model; superseded by Llama Guard 3