Reinforced Security Agent (AI Agent)

One line: A self-reinforcing security agent (LLM + ReAct) that replaces a rule-based WAF. It analyzes request context, decides whether a request is an attack, blocks it, and autonomously updates its own defense policy, forming a closed loop of detection → blocking → policy update → re-detection.

Period: April 1, 2024 – July 10, 2024
Role: System planning, design, and full implementation
Type: In-house R&D PoC (Proof of Concept)

Background & planning

Recent hacking incidents at organizations such as SKT and the Korea Employment Information Service made cybersecurity a major social issue. In our team's AWS Cloud environment, security checks relied on a WAF with a static, rule-based approach, which was weak against newly emerging attack patterns. I therefore started an R&D PoC for a self-reinforcing security agent that reads the context of incoming requests, judges potential attacks, and autonomously strengthens defense policies.

[Figure] Request → Secure Gateway (block / allow) → Secure Monitoring (log-based anomaly detection) → LLM · ReAct (reason → threat assessment) → policy update + admin alert. Self-reinforcing loop: detection → blocking → policy update → re-detection.
Reinforced Security Agent replaces WAF: Gateway, Monitoring, and LLM (ReAct) form a closed self-reinforcing loop.

Impact

  • Real-time anomaly detection — adapts to unpredictable new threats while improving response time and resource efficiency.
  • Self-reinforcing security loop — detection → blocking → policy update → re-detection.
  • Automated policy updates — threat assessments are immediately reflected in Gateway policies.
  • Integrated admin notifications + auto-blocking — better operational efficiency and reliability.

Tech stack

Security · Agent · RAG · ReAct · Log
Python · FastAPI · LangGraph · LangChain
Qdrant (VectorDB)

Key roles & achievements

Designed and implemented a fully automated end-to-end security system (Feedback System) by integrating Secure Monitoring and Secure Gateway agents into an LLM-based ReAct framework — covering log-based detection, real-time blocking and policy reinforcement.

⊙ Monitoring–Gateway integration architecture

  • Designed a self-reinforcing loop: initial blocking → log collection → anomaly detection → automated response.
  • Implemented a bi-directional feedback mechanism between the Gateway and Monitoring agents.
  • Developed a ReAct-based reasoning & policy-reinforcement flow powered by LLMs.
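
The loop described in the bullets above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the `Gateway` class, the `assess_threat` string check (standing in for the LLM/ReAct assessment), and the IP-based block policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    """Hypothetical gateway: blocks requests whose source IP is in the policy."""
    blocked_ips: set = field(default_factory=set)
    access_log: list = field(default_factory=list)

    def handle(self, ip: str, path: str) -> str:
        decision = "block" if ip in self.blocked_ips else "allow"
        # Every decision is logged so the monitoring side can review it later.
        self.access_log.append({"ip": ip, "path": path, "decision": decision})
        return decision


def assess_threat(entry: dict) -> bool:
    """Placeholder for the LLM/ReAct assessment (a trivial string check here)."""
    path = entry["path"].lower()
    return "../" in path or "union select" in path


def monitor_and_reinforce(gateway: Gateway) -> list:
    """Scan allowed requests in the log and push new blocks back to the gateway."""
    newly_blocked = []
    for entry in gateway.access_log:
        if entry["decision"] == "allow" and assess_threat(entry):
            if entry["ip"] not in gateway.blocked_ips:
                gateway.blocked_ips.add(entry["ip"])
                newly_blocked.append(entry["ip"])
    return newly_blocked


gateway = Gateway()
first = gateway.handle("203.0.113.7", "/download?file=../../etc/passwd")  # not yet known
blocked = monitor_and_reinforce(gateway)                                  # policy update
second = gateway.handle("203.0.113.7", "/index.html")                     # re-detection
print(first, blocked, second)
```

The first request passes because the policy does not know the source yet; the monitoring pass flags it from the log and updates the policy, so the next request from the same source is blocked at the gateway.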

⊙ LLM-based threat detection & response

  • Designed LLM prompts & tools for log summarization and threat assessment.
  • Enabled rapid response via Qdrant vector-DB caching.
  • Built suspicious-request classification (IP, URL, User-Agent) with few-shot techniques.
  • Implemented automated admin notifications and Gateway policy auto-update.
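
As an illustration of the few-shot classification over IP, URL, and User-Agent, a prompt could be assembled roughly as follows. The example requests, the label names (`suspicious` / `normal`), and the prompt wording are invented for this sketch; the actual prompts used in the project are not shown in this document.

```python
# Made-up labelled examples for illustration only (documentation IP ranges).
FEW_SHOT_EXAMPLES = [
    {"ip": "198.51.100.23", "url": "/login?user=admin'--",
     "user_agent": "sqlmap/1.7", "label": "suspicious"},
    {"ip": "192.0.2.10", "url": "/products/42",
     "user_agent": "Mozilla/5.0", "label": "normal"},
]


def format_request(req: dict) -> str:
    """Render the three fields the classifier looks at."""
    return f"IP: {req['ip']}\nURL: {req['url']}\nUser-Agent: {req['user_agent']}"


def build_few_shot_prompt(request: dict) -> str:
    """Prepend labelled examples, then leave the final label open for the LLM."""
    parts = ["Classify each request as 'suspicious' or 'normal'.", ""]
    for example in FEW_SHOT_EXAMPLES:
        parts.append(format_request(example))
        parts.append(f"Label: {example['label']}")
        parts.append("")
    parts.append(format_request(request))
    parts.append("Label:")
    return "\n".join(parts)


prompt = build_few_shot_prompt(
    {"ip": "203.0.113.7", "url": "/download?file=../../etc/passwd", "user_agent": "curl/8.0"}
)
print(prompt)
```

The resulting string would be sent to the LLM as a single message, and the model's completion of the last `Label:` line is the classification.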

Troubleshooting 1 — Agent memory (token-based message management)

Unlike typical chat systems that continuously exchange messages with a user, here the server delivers only the initial System and Human messages at startup, then runs in a repeated loop. As messages accumulate, the initial System/Human context can eventually drop out. To prevent this, messages are segmented into 10,000-token units, and every new message must explicitly include both the System and Human messages — keeping context consistent over long-term operation.

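[Figure] Case 1 (initial messages lost): as messages accumulate, the leading System/Human messages are truncated. Case 2 (10k-token units with Sys/Hum always included): each 10k-token chunk explicitly carries [Sys+Hum], so the context is preserved.
To prevent loss of the initial context during long-term operation, messages are split into 10,000-token units and the System and Human messages are included with every message.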
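
One way to read this scheme is as a trimming step that pins the System and Human messages and fills the remaining token budget with the most recent loop messages. The sketch below follows that reading; `build_context` is a hypothetical helper, and `count_tokens` is a crude word count standing in for a real tokenizer.

```python
def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def build_context(system: str, human: str, history: list, budget: int = 10_000) -> list:
    """Return [system, human, *recent history] fitting within `budget` tokens."""
    pinned = [("system", system), ("human", human)]
    remaining = budget - sum(count_tokens(text) for _, text in pinned)
    kept = []
    # Walk history newest-first so the most recent loop steps survive trimming.
    for role, text in reversed(history):
        cost = count_tokens(text)
        if cost > remaining:
            break
        kept.append((role, text))
        remaining -= cost
    # Restore chronological order behind the pinned messages.
    return pinned + list(reversed(kept))


# Five loop messages of 10 "tokens" each; a small budget is used to force trimming.
history = [("ai", f"step {i} " + "log " * 8) for i in range(5)]
context = build_context(
    "You are a security agent.", "Monitor the gateway logs.", history, budget=40
)
for role, text in context:
    print(role, "|", text.strip())
```

With the small demo budget, the oldest loop messages are dropped while the System and Human messages stay at the front of every context that is built.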

Troubleshooting 2 — Log pipeline (long context to the LLM)

Sending the entire raw log to the LLM risks exceeding token limits; conversely, dumping raw logs into a Vector DB and querying them loses the flow/continuity of the logs. Vector storage also explodes — embedding just 13.3 MB of raw log with text-embedding-3-large expands to 914.2 MB.

Embedding model | Location | Dim | Raw (log) | Vectorized | Time
text-embedding-3-large | External | 3072 | 13.3 MB | 914.2 MB | 41m 30.6s
text-embedding-3-small | External | 1536 | 13.3 MB | 754.3 MB | 20m 44.0s
all-MiniLM-L6-v2 | Local | 384 | 13.3 MB | 625.6 MB | 22m 53.0s

[Table] Storage expansion by embedding model
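
To make the expansion concrete, the ratios can be computed directly from the table figures. The snippet only restates the measured sizes above; no new measurements are implied.

```python
RAW_MB = 13.3  # raw log size from the table

# Vectorized sizes (MB) from the table, keyed by embedding model.
vectorized_mb = {
    "text-embedding-3-large": 914.2,
    "text-embedding-3-small": 754.3,
    "all-MiniLM-L6-v2": 625.6,
}

ratios = {model: round(size / RAW_MB, 1) for model, size in vectorized_mb.items()}
for model, ratio in ratios.items():
    print(f"{model}: about {ratio}x the raw log size")
```

Even the smallest model stores roughly 47 times the raw size, and the largest roughly 69 times.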

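[Figure] Logs (large raw volume) → Message Broker (buffer) → Agent (polls via Tool API) → LLM. The agent polls only as much as it needs and decides the number of entries per request on its own.
Message Broker-based pipeline: the agent fetches only the logs it needs through a Tool API and passes them to the LLM.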

Solution: deliver logs through a Message Broker-based pipeline. The agent polls the broker only as needed. Polling is exposed as a Tool API, so the agent invokes the tool and decides for itself how many entries to request, which keeps the context passed to the LLM small.
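
A minimal sketch of the polling tool, assuming an in-memory queue in place of a real message broker. `LogBroker`, `poll_logs`, the `max_entries` parameter, and the `NO_NEW_LOGS` sentinel are hypothetical names chosen for the example.

```python
import queue


class LogBroker:
    """In-memory stand-in for the message broker (a real deployment would
    use an actual broker client instead)."""

    def __init__(self):
        self._queue = queue.Queue()

    def publish(self, line: str) -> None:
        self._queue.put(line)

    def poll(self, max_entries: int) -> list:
        """Return up to `max_entries` buffered log lines without blocking."""
        batch = []
        while len(batch) < max_entries:
            try:
                batch.append(self._queue.get_nowait())
            except queue.Empty:
                break
        return batch


broker = LogBroker()


def poll_logs(max_entries: int = 50) -> str:
    """Tool exposed to the agent: the agent chooses `max_entries` per call."""
    lines = broker.poll(max_entries)
    return "\n".join(lines) if lines else "NO_NEW_LOGS"


for i in range(5):
    broker.publish(f"GET /item/{i} 200")

first_batch = poll_logs(max_entries=3)    # agent asks for a small batch
second_batch = poll_logs(max_entries=10)  # asks for more; only 2 remain
third_batch = poll_logs()                 # buffer is empty
print(first_batch, second_batch, third_batch, sep="\n---\n")
```

Because each call consumes entries from the buffer in order, the log sequence is preserved across calls, and the amount of text entering the LLM context is bounded by whatever the agent requests.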

Troubleshooting 3 — Limited reasoning (Inference Tool)

A ReAct agent mostly focuses on tool invocation with limited autonomous reasoning. I added an Inference Tool (think_aloud) so the agent can reason independently at intermediate steps. Tools can also conflict (Tool 1 says OK, Tool 2 says Not OK; only the last result may be applied) — resolved via prompt engineering so the agent weighs multiple tool outputs appropriately.
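
A common way to build such a tool is as a no-op that echoes the agent's thought back, so the reasoning is recorded as an observation alongside real tool results. The sketch below assumes that pattern; apart from the name `think_aloud`, the tools, their hard-coded results, and the prompt wording are hypothetical.

```python
def think_aloud(thought: str) -> str:
    """No-op tool: echoes the thought so it is recorded as an observation."""
    return f"THOUGHT: {thought}"


def check_ip_reputation(ip: str) -> str:
    """Hypothetical tool with a hard-coded result for illustration."""
    return "OK"


def check_payload(path: str) -> str:
    """Hypothetical tool: flags path traversal patterns."""
    return "NOT_OK" if "../" in path else "OK"


def format_observations(observations: list) -> str:
    """Render every tool result so the prompt shows all verdicts, not just the last."""
    return "\n".join(f"- {name}: {result}" for name, result in observations)


observations = [
    ("check_ip_reputation", check_ip_reputation("203.0.113.7")),
    ("check_payload", check_payload("/download?file=../../etc/passwd")),
    ("think_aloud", think_aloud(
        "The IP looks clean but the path is a traversal attempt; treat as a threat."
    )),
]

prompt_section = (
    "Tool results so far (weigh all of them; a later result does not "
    "override an earlier one):\n" + format_observations(observations)
)
print(prompt_section)
```

Listing all results together with an explicit instruction is one simple form of the prompt-level conflict handling described above.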

Troubleshooting 4 — Reducing external-API latency

  • Local embedding — a SentenceTransformer-based embedding layer is deployed locally, minimizing reliance on external API calls.
  • Optimized initial judgment — if the LLM can make a first-level decision immediately, the system handles it directly (fast path) without invoking the full ReAct graph; only ambiguous cases go to the ReAct tools.
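
The fast path in the second bullet can be pictured as a routing function in front of the ReAct graph. This is a sketch under assumptions: `quick_judgment` stands in for a single LLM call, `run_react_graph` for the full graph, and the confidence values and 0.9 threshold are invented for the example.

```python
def quick_judgment(request: dict) -> tuple:
    """Placeholder for a single LLM call returning (verdict, confidence)."""
    path = request["path"].lower()
    if "union select" in path or "../" in path:
        return "block", 0.98
    if path in {"/", "/health"}:
        return "allow", 0.99
    return "unknown", 0.40


def run_react_graph(request: dict) -> str:
    """Placeholder for the full ReAct graph with tools (always 'allow' here)."""
    return "allow"


def decide(request: dict, min_confidence: float = 0.9) -> tuple:
    """Return (verdict, route); only ambiguous requests reach the ReAct graph."""
    verdict, confidence = quick_judgment(request)
    if verdict != "unknown" and confidence >= min_confidence:
        return verdict, "fast_path"
    return run_react_graph(request), "react"


print(decide({"path": "/health"}))
print(decide({"path": "/search?q=1 UNION SELECT password"}))
print(decide({"path": "/api/v2/export"}))
```

Clear-cut requests are resolved with one call, and the multi-step tool loop is reserved for the cases where the first judgment is not confident enough.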