I don't get it. Does it only respond with safe or unsafe?
Interesting, is this able to run in parallel with my coding model?
What is the point?
It responds with either one line or two. Two lines: "unsafe\nS1", where S1 (or another S-number) is the category violated (go to ollama and search for the model to see the categories). One line: "safe".
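The one-line / two-line reply format described above can be turned into a small parser. This is only a sketch: `parseGuardReply` is a made-up name, and it assumes the second line may hold several comma-separated category codes (for example "S1,S10"). Anything it does not recognize is treated as unsafe.

```javascript
// Sketch: parse a Llama Guard style reply.
// Expected shapes: "safe", or "unsafe" followed by a line of category codes.
// parseGuardReply is a hypothetical helper name, not part of any library.
function parseGuardReply(reply) {
  const lines = reply
    .trim()
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  const verdict = (lines[0] || "").toLowerCase();
  if (verdict === "safe") {
    return { safe: true, categories: [] };
  }
  if (verdict === "unsafe") {
    // Assumption: codes on the second line are comma-separated.
    const categories = (lines[1] || "")
      .split(",")
      .map((code) => code.trim())
      .filter((code) => code.length > 0);
    return { safe: false, categories };
  }
  // Anything unexpected is treated as unsafe so that a malformed reply
  // is never passed through as "safe".
  return { safe: false, categories: [], unrecognized: true };
}

console.log(parseGuardReply("safe"));
console.log(parseGuardReply("unsafe\nS1"));
```

Failing closed on an unrecognized reply is a design choice; a stricter setup might log and retry instead.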
Task: Check if the following text is safe.
Categories:
- Violence
- Hate
- Sexual
- Self-harm
- Criminal
- etc.
Answer ONLY in this format:
SAFE
or
UNSAFE: <category>
Text:
{{INPUT}}
// callOllama(model, prompt) is assumed to be a helper that sends the prompt
// to a local Ollama server and resolves to the model's text reply.
async function isSafe(text) {
  const prompt = `
Task: Check if the following text is safe.
Answer ONLY:
SAFE
or
UNSAFE: <category>
Text:
${text}
`;
  const result = await callOllama("llama-guard3:8b", prompt);
  // Llama Guard replies in lowercase ("safe" or "unsafe"), so compare
  // case-insensitively and ignore surrounding whitespace.
  return result.trim().toLowerCase().startsWith("safe");
}

async function safeChat(userInput) {
  // 1. Check input
  const inputSafe = await isSafe(userInput);
  if (!inputSafe) {
    return "Input blocked due to safety policy.";
  }

  // 2. Generate response
  const response = await callOllama("llama3", userInput);

  // 3. Check output
  const outputSafe = await isSafe(response);
  if (!outputSafe) {
    return "Response blocked due to safety policy.";
  }

  return response;
}
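The snippet above calls `callOllama` without defining it. One possible way to write it, assuming a local Ollama server on the default port 11434 and its `/api/generate` endpoint with streaming turned off. The `options` argument (`baseUrl`, `fetchImpl`) is my own addition so the function can be pointed elsewhere or tested without a server. Treat it as a sketch, not the original poster's code.

```javascript
// Sketch of the callOllama helper assumed by isSafe/safeChat.
// Assumptions: Ollama listens on http://localhost:11434 and the runtime
// provides a global fetch (Node 18+ or a browser).
async function callOllama(model, prompt, options = {}) {
  const baseUrl = options.baseUrl || "http://localhost:11434";
  const fetchImpl = options.fetchImpl || fetch;

  const res = await fetchImpl(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    // stream: false asks for one JSON object instead of a chunked stream.
    body: JSON.stringify({ model, prompt, stream: false }),
  });
  if (!res.ok) {
    throw new Error(`Ollama request failed with status ${res.status}`);
  }
  const data = await res.json();
  // The generated text is returned in the "response" field.
  return data.response;
}
```

With a model pulled locally, `await callOllama("llama-guard3:8b", "some text")` should resolve to the guard's raw reply.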
I just asked ChatGPT to explain to me...
User → Llama Guard → Main LLM → Llama Guard → User.
Practical use cases
You’d use Llama Guard 3 (8B) if you are:
- Building a chatbot and need content moderation
- Running local LLMs and want safety without external APIs
- Creating AI agents with tools (search/code execution)
- Implementing compliance filtering (enterprise / public apps)
Thank you so much @hatzisn for pointing this tool out. Awesome.
I assume most LLMs already have guardrails.
I'm just wondering whether text that llama-guard3 marks as "safe" may still be flagged as "unsafe" by other models.
Has llama-guard3 already been tested with all models?
If not, I don't see the point here.
Well, in all the models that we are currently using, there is a disclaimer that the information provided might not be accurate.
Okay.
If I understand correctly, this model could be useful when integrating a chatbot into our system for clients or end users.
At least it filters simple prompts, so the developers don't get blamed or sued by end users for not providing a chatbot that is safe for their children to use.
Some models come with guardrails built in. The one I use (locally) will not talk about:
- Illicit or illegal activities
- Violence, self-harm, or suicide-related material
- Harassment, hate speech, or discrimination
- Adult or sexual content involving minors
- Extremist, terrorist, or violent radicalization material
- Misinformation or disinformation that could cause harm
- Privacy-invasive or personally identifying information
- Copyright-protected media or software that is shared without permission
Let me try this out... I use the Nemotron-3-nano 30b model.
ollama launch claude --model nemotron-3-nano:30b-cloud
ollama launch vscode --model nemotron-3-nano:30b-cloud