
API Gateway Solution (BEAST)

One line — A single API Gateway that unifies many different provider APIs (Netflix, Disney, Kakao …) under one authentication / routing layer. Now the core engine of KT API-Link, reliably handling 50M+ requests/day.

Period: March 2022 to December 2023
Role: Backend Server Developer · Performance Optimization
Scope: Solutions for both internal and external use

What it does

The solution consists of an API Gateway Core System and a back-office portal for monitoring and managing APIs. It unifies different API interfaces and centralizes authentication / authorization, collapsing a complex integration structure into a single gateway.

[Figure: Clients (Netflix / OTT, Kakao, Partners, …) → API Gateway (BEAST): Auth / AuthZ, Routing, Handlers (JSON ↔ XML, transform) → Backend APIs (Provider A, Provider B, Provider …). A back-office portal provides monitoring and management. 50M+ requests / day.]
One gateway in front of many provider APIs — unified auth, routing and request transformation
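The "one auth check, then route" idea can be sketched in a few lines of plain Java. This is only an illustration under assumptions: the class `GatewaySketch`, its methods, the API-key scheme and the `example` hosts are all hypothetical and are not taken from BEAST.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the real gateway.
class GatewaySketch {
    // Path prefix -> upstream base URL (placeholder hosts).
    private final Map<String, String> routes = new LinkedHashMap<>();
    // API keys accepted by the single, central auth check.
    private final Set<String> validKeys;

    GatewaySketch(Set<String> validKeys) {
        this.validKeys = validKeys;
    }

    void register(String prefix, String upstreamBaseUrl) {
        routes.put(prefix, upstreamBaseUrl);
    }

    // Returns the upstream URL to call, or a status-like string on failure.
    String resolve(String apiKey, String path) {
        if (apiKey == null || !validKeys.contains(apiKey)) {
            return "401 Unauthorized"; // auth happens once, before any routing
        }
        for (Map.Entry<String, String> route : routes.entrySet()) {
            if (path.startsWith(route.getKey())) {
                // Strip the gateway prefix and forward the remainder upstream.
                return route.getValue() + path.substring(route.getKey().length());
            }
        }
        return "404 No route";
    }
}
```

Because every provider sits behind the same `resolve` step, clients only need one credential and one base URL, and adding a provider becomes a `register` call rather than a new client integration.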

Impact

  • Unified different API interfaces (e.g., Netflix, Disney) and centralized authentication / authorization, simplifying a complex integration structure into a single gateway.
  • Integrated into KT API-Link, where it now operates as the core engine of the API integration platform, reliably handling over 50 million requests per day.

Live service: KT API-Link.

Tech stack

  • API Gateway · Handler · RESTful API · Protocol / Network
  • Java · Spring Boot · Spring WebFlux · Spring Cloud
  • PostgreSQL · MongoDB · JPA · R2DBC · Reactive MongoDB
  • Linux (Ubuntu) · AWS · On-premise

Key roles & achievements

The project was carried out in three phases over roughly two years (March 2022 to December 2023).

[Figure: Phase 1 Back-Office Portal → Phase 2 KT API-Link Integration → Phase 3 WebFlux / Reactive Core]
Three sequential phases over two years

⊙ Phase 1 — Back-Office Portal

  • Dashboard: real-time monitoring of API status and performance.
  • API Management: registration, modification, deletion and lifecycle management.
  • Data Collection & Aggregation: scheduled periodic log aggregation.
  • Community Support: announcements and user-community features.

⊙ Phase 2 — System Integration (KT API-Link)

  • Integrated the gateway into KT API-Link and improved system performance.
  • Query Performance Optimization: composite indexes and optimized data structures.
  • Handler Management: customizable deployment and configuration.
  • Security: adapted the gateway to infrastructure constraints such as firewalled environments by adding bypass functions.

⊙ Phase 3 — Core Enhancement & WebFlux Migration

  • Migrated to a Reactive REST API with WebFlux for high-performance processing.
  • Non-blocking data access: applied R2DBC for relational access and Reactive MongoDB for document storage.
  • Thread Pool Optimization: kept Netty's event loop free of blocking I/O by offloading blocking and asynchronous tasks to separate thread pools (TaskScheduler).
  • Performance Testing: measured TPS and response times with Apache Bench, then optimized based on the results.
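For the performance-testing step, the arithmetic behind TPS and response-time figures can be sketched as below. This is not Apache Bench itself and not the project's tooling: the class `LoadTestSummary` and its methods are hypothetical, and the nearest-rank percentile is just one common definition.

```java
import java.util.Arrays;

// Hypothetical helper for summarising load-test samples.
class LoadTestSummary {
    // Throughput: completed requests divided by wall-clock duration.
    static double tps(int completedRequests, double elapsedSeconds) {
        return completedRequests / elapsedSeconds;
    }

    static double meanMillis(long[] latenciesMillis) {
        return Arrays.stream(latenciesMillis).average().orElse(0.0);
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static long percentileMillis(long[] latenciesMillis, double p) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = latenciesMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

Looking at a high percentile next to the mean matters for a gateway: a handful of requests stuck behind a blocked thread barely move the median but dominate the tail.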

Troubleshooting — Thread-pool exhaustion bottleneck

Issue: After certain APIs were registered, request processing slowed dramatically. Some requests were delayed for tens of seconds or never received a response. The cause and the responsible API were initially unclear, so response delays persisted across the whole system.

Root cause: The gateway lets you insert custom handlers at intermediate stages (JSON↔XML conversion, format transformation, external API calls). Many handlers made blocking calls to external providers. When an external response was slow, the blocking code held Netty worker threads for a long time, which exhausted the EventLoop and collapsed throughput for every request sharing it. JPA-based DB access, which is also blocking, added smaller delays on top.
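The failure mode can be reproduced in miniature with the JDK alone. In this sketch a single-threaded executor stands in for one Netty event-loop thread, and `Thread.sleep` stands in for a slow external provider call; both are simplifying assumptions, and the class name `EventLoopStarvationDemo` is hypothetical.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: a blocking handler running directly on the "event loop".
class EventLoopStarvationDemo {
    // Returns the order in which the two pieces of work finish.
    static List<String> run() {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        List<String> finished = new CopyOnWriteArrayList<>();

        eventLoop.execute(() -> {
            sleep(200); // stand-in for a slow external provider call
            finished.add("slow-handler");
        });
        // This request is cheap, but it is queued behind the blocked thread.
        eventLoop.execute(() -> finished.add("fast-request"));

        eventLoop.shutdown();
        try {
            eventLoop.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return finished;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The fast request always finishes second, even though it needs almost no work. With only a few event-loop threads serving all traffic, a few slow handlers are enough to stall unrelated APIs in the same way.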

[Figure: Before (blocking on EventLoop): Netty EventLoop worker threads (I/O) run a blocking handler that calls an external API; threads are held until exhaustion, and requests pile up or time out. After (offload + reactive): the Netty EventLoop does pure I/O only, a dedicated pool (TaskScheduler) runs blocking operations, and R2DBC and Reactive MongoDB provide non-blocking DB access, giving stable high throughput.]
Blocking work moved off the EventLoop into a dedicated pool; DB access made reactive

Resolution:

  • Separated blocking operations into dedicated thread pools (e.g., TaskScheduler), letting the main EventLoop do pure I/O only.
  • Converted blocking calls to multi-threaded asynchronous execution, so one slow handler no longer affects the whole system.
  • Migrated DB operations (e.g., log insertion) to Reactive MongoDB and R2DBC, minimizing blocking.
  • After restructuring, the gateway reliably handled high traffic without degradation, preventing recurrence of operational incidents.
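The offloading pattern from the first two bullets can be sketched with `CompletableFuture` from the JDK. This is a stand-in rather than the project's code: the real system used Netty's EventLoop with a TaskScheduler-backed pool, and in Reactor the comparable move is usually `subscribeOn` or `publishOn` with a scheduler meant for blocking work. The class `OffloadDemo`, the pool size of 4 and the 200 ms delay are assumptions for illustration.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo: blocking work moved to a dedicated pool.
class OffloadDemo {
    // Returns the order in which the two pieces of work finish.
    static List<String> run() {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor(); // stand-in for one event-loop thread
        ExecutorService blockingPool = Executors.newFixedThreadPool(4);  // dedicated pool for blocking calls
        List<String> finished = new CopyOnWriteArrayList<>();

        CompletableFuture<Void> slow = CompletableFuture
                .supplyAsync(OffloadDemo::callSlowProvider, blockingPool) // blocks a pool thread only
                .thenAcceptAsync(finished::add, eventLoop);               // result handled back on the event loop

        // The event loop is free, so this request completes right away.
        CompletableFuture<Void> fast =
                CompletableFuture.runAsync(() -> finished.add("fast-request"), eventLoop);

        CompletableFuture.allOf(slow, fast).join();
        eventLoop.shutdown();
        blockingPool.shutdown();
        return finished;
    }

    // Stand-in for a slow, blocking external provider call.
    private static String callSlowProvider() {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "slow-handler";
    }
}
```

Here the fast request finishes first because the slow call only occupies a thread in `blockingPool`. Bounding that pool is a deliberate choice: a misbehaving provider can saturate its own pool, but it can no longer take the I/O threads down with it.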