Cyrus Nikolaidis

AI Research Engineer · Meta · New York

I build systems that help make AI safer — guardrails, evaluations, and monitoring for large language models.

I'm a Tech Lead at Meta's Superintelligence Labs, working on AI trust monitoring. Previously, I led efforts to apply ML to privacy and security challenges at Meta.

Much of our work has been open-sourced and widely adopted by the community: CyberSecEval, LlamaFirewall, PromptGuard, and LlamaGuard.

Before all this, I worked on recommendation systems — I built the first version of Instagram's feed ranking and worked on Facebook search.

Updates

Aug 2025
Spoke at Black Hat Arsenal and The Diana Initiative in Las Vegas.
Jun 2025
Joined Meta's Superintelligence Labs as an AI Research Engineer, working on safety evaluations and monitoring.
Apr 2025
Released Llama Prompt Guard 2 (86M and 22M) — we believe these are the strongest jailbreak detection models available.
Apr 2025
Published LlamaFirewall — a framework for deploying AI agent defenses. (Website, GitHub)
Feb 2025
Hosted a workshop and demo at AI Security Forum in Paris.
Aug 2024
Spoke at DEF CON AI Village on evaluations and guardrails against prompt injection attacks.
Aug 2024
Hosted a workshop at AI Security Forum.
Jul 2024
Published CyberSecEval 3 — advancing the evaluation of cybersecurity risks in LLMs.
Jul 2024
Co-authored The Llama 3 Herd of Models paper — our team contributed cybersecurity risk measurements and system-level safety models.
Jul 2024
Launched PromptGuard and LlamaGuard 3, part of Meta's system-level safety stack for Llama.
Apr 2024
Published CyberSecEval 2 — a wide-ranging cybersecurity evaluation suite for LLMs.
Dec 2023
Published Purple Llama CyberSecEval — a secure coding benchmark for language models.