ActiveFence, the self-described “AI Safety” company, has published its first AI Security Benchmark Report: Prompt Injections, and the results put its model at the top of the pack.
In testing against six leading guardrails and APIs, ActiveFence’s safety and security model delivered the highest F1 score (0.857) and precision (0.890), while holding its false-positive rate to a competitive 5.4%. For enterprises worried about shipping AI copilots, chatbots, and creative tools without opening the door to jailbreaks and data leaks, that’s a meaningful balance.
Prompt Injections: The Achilles’ Heel of GenAI
Prompt injection—the practice of tricking a model into ignoring its guardrails, leaking sensitive data, or generating harmful content—has become one of the most pressing threats in generative AI deployments. Attackers use layered or indirect instructions to bypass filters, sometimes in subtle ways that standard safety stacks miss.
ActiveFence’s benchmark spanned more than 28,000 benign and adversarial prompts, mapped to OWASP and MITRE ATLAS categories. That scope included everything from jailbreak attempts to safety-critical abuse cases.
The company also stressed that its model isn’t just strong in English. It maintained leading performance across 13 languages, including Chinese, German, Japanese, Korean, and Spanish—a critical feature for global enterprises deploying AI features at scale.
Safety Without User Friction
“No one should have to choose between strong guardrails and great user experience,” said Noam Schwartz, Co-Founder and CEO of ActiveFence. “This benchmark shows you can have both—high coverage and low false positives—so teams can ship AI features confidently, at scale.”
Avi Golan, Chief Product & Engineering Officer, added that the model is built to “travel with AI” across use cases and geographies. “Our model’s multilingual strength and consistently high F1 scores give organizations durable protection,” he said.
Why It Matters
The report lands at a time when nearly every major enterprise is experimenting with AI copilots or customer-facing assistants. But without effective protection, prompt injection can quickly turn these features into liabilities, forcing costly manual reviews or leading to reputational damage.
While other vendors—OpenAI, Microsoft, and startups alike—offer safety layers, ActiveFence is positioning itself as the one delivering the best balance of coverage, usability, and operational cost savings. That balance will likely resonate with enterprises eager to scale GenAI responsibly without frustrating end users.
The takeaway? If prompt injection is the new phishing, ActiveFence wants to be the Proofpoint of AI security.
Power Tomorrow’s Intelligence — Build It with TechEdgeAI