A new evaluation led by LatticeFlow AI, in collaboration with SambaNova, demonstrates that open-source GenAI models, when properly protected, can meet or even exceed the security standards of closed models. The findings suggest these models are now suitable for enterprise deployment—even in highly regulated sectors such as financial services.
Quantifying Security for Open-Source Models
The study assessed the top five open models, measuring security before and after applying input-filtering guardrails designed to block malicious or manipulative prompts. The results were striking: security scores rose from as low as 1.8% on the weakest base model to as high as 99.6% once guardrails were applied, all while maintaining over 98% quality of service.
“LatticeFlow AI’s evaluation confirms that with the right safeguards, open-source models are enterprise-ready for regulated industries, providing transformative advantages in cost efficiency, customization, and responsible AI governance,” said Harry Ault, Chief Revenue Officer at SambaNova.
Why Enterprises Care
Many companies are exploring open-source GenAI to reduce vendor lock-in, accelerate innovation, and gain flexibility. Yet adoption has often stalled for lack of clear, quantifiable security and risk metrics. This evaluation addresses that gap with technical, measurable evidence to guide enterprise deployment.
“At LatticeFlow AI, we provide the deepest technical controls to evaluate GenAI security and performance,” said Dr. Petar Tsankov, CEO and Co-Founder of LatticeFlow AI. “These insights give AI, risk, and compliance leaders the clarity they’ve been missing, empowering them to move forward with open-source GenAI safely and confidently.”
Key Findings
The evaluation tested five widely used open foundation models:
- Qwen3-32B
- DeepSeek-V3-0324
- LLaMA-4 Maverick-17B-128E-Instruct
- DeepSeek-R1
- LLaMA-3.3 70B Instruct
Each model was assessed in two configurations: the base model and a guardrailed version enhanced with a dedicated input-filtering layer to block adversarial prompts. Simulated attack scenarios—such as prompt injection and manipulation—were used to measure resilience without impacting usability.
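The report does not describe how the input-filtering layer works internally. As a rough illustration only, such a filter typically sits in front of the model and rejects prompts it flags as adversarial before they ever reach the model. Everything below (the pattern list, function names, and block message) is a hypothetical sketch, not the evaluated guardrail:

```python
import re

# Hypothetical patterns for common prompt-injection phrasings; a production
# filter would typically use a trained classifier rather than regexes.
INJECTION_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"disregard (the )?(system|above) prompt",
    r"you are now in developer mode",
]

BLOCK_MESSAGE = "Request blocked by input guardrail."

def is_adversarial(prompt: str) -> bool:
    """Flag prompts matching a known injection pattern."""
    lowered = prompt.lower()
    return any(re.search(p, lowered) for p in INJECTION_PATTERNS)

def guarded_generate(model_call, prompt: str) -> str:
    """Wrap a model call with an input filter: flagged prompts are blocked,
    everything else passes through to the base model unchanged."""
    if is_adversarial(prompt):
        return BLOCK_MESSAGE
    return model_call(prompt)
```

Production guardrails generally rely on trained classifiers rather than pattern lists, but the architectural point is the same: the base model is left untouched, and the security gains come entirely from filtering its inputs.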
Security scores, base model → guardrailed:
- DeepSeek-R1: 1.8% → 98.6%
- LLaMA-4 Maverick: 33.5% → 99.4%
- LLaMA-3.3 70B Instruct: 51.8% → 99.4%
- Qwen3-32B: 56.3% → 99.6%
- DeepSeek-V3: 61.3% → 99.4%
All models maintained over 98% quality of service, confirming that security improvements did not compromise user experience.
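The report gives aggregate percentages without defining the metrics. One plausible reading, consistent with how such benchmarks are usually scored, is that security is the share of simulated attacks blocked and quality of service is the share of benign prompts still answered normally. The sketch below assumes exactly that, reusing the hypothetical guardrail wrapper from above:

```python
BLOCK_MESSAGE = "Request blocked by input guardrail."  # sentinel from the sketch above

def security_score(attack_prompts, guarded_model) -> float:
    """Percent of simulated attack prompts the guarded model blocks (assumed metric)."""
    blocked = sum(guarded_model(p) == BLOCK_MESSAGE for p in attack_prompts)
    return 100.0 * blocked / len(attack_prompts)

def quality_of_service(benign_prompts, guarded_model) -> float:
    """Percent of benign prompts that still receive a normal, non-blocked answer."""
    served = sum(guarded_model(p) != BLOCK_MESSAGE for p in benign_prompts)
    return 100.0 * served / len(benign_prompts)
```

Under these assumed definitions, DeepSeek-R1's jump from 1.8% to 98.6% would mean the base model resisted roughly 1 in 50 simulated attacks, while the guardrailed version blocked nearly all of them.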
Implications for Financial Services and Regulated Industries
As GenAI moves from experimentation to production, enterprises face increased scrutiny from regulators, boards, and internal risk teams. Solutions must be auditable, controllable, and demonstrably secure.
This evaluation provides clear, quantitative evidence that open-source models—when equipped with proper risk guardrails—can satisfy enterprise-grade security requirements, potentially reshaping adoption strategies for regulated industries.