Redis — In-Memory Middleware

One line — Toward the end of the KT DS cloud-service work, I brought Redis in as the in-memory middleware layer: caching, session store and distributed lock on the client (Spring Boot) side, and an operation model — replication · sharding · Sentinel / Cluster — on the server side. This page distills the engineering manual I wrote so the team could adopt it.

Period
2023 · KT DS — Cloud service
Role
Backend & middleware engineering · Spring–Redis integration
Scope
Server-side operation + client-side Spring Boot integration
Deliverable
Internal Redis manual (Redis 7.2), used as the team reference

Why Redis — and where it fits

Redis (REmote DIctionary Server) is an open-source, in-memory key–value NoSQL store. Because it keeps data in memory it is extremely fast (~100K ops/s), but the same memory residency makes it volatile — so clustering and backup are non-negotiable for production. That speed-vs-volatility trade-off is exactly why it shines for short-lived, frequently-touched data: caching, session, message queue (Streams), Pub/Sub, and distributed locks. I integrated it at the application level through Spring Boot rather than driving it by hand with redis-cli.

Two integration patterns

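[Figure: two integration patterns. Look-aside cache: App (client) → Redis → DB; ① hit? ② on a miss, read from the DB (Redis is consulted before the DB). Write-back: App (client) → Redis (buffer) → DB; ① write everything to Redis, ② batch-insert into the DB; 1000×1 ≫ 1×1000, so buffer the writes and flush on a schedule.]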
Look-aside (check the cache first, read the DB only on a miss) vs Write-back (buffer writes, then batch-insert): one big insert beats a thousand small ones
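
The two patterns can be sketched in plain Java. This is a minimal illustration under loud assumptions: a HashMap stands in for Redis, a list stands in for the write buffer, and the class and method names (CachePatternsSketch, read, write, flush) are hypothetical, not taken from the manual.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: plain collections stand in for Redis and the database.
class CachePatternsSketch {
    final Map<String, String> cache = new HashMap<>();   // stand-in for Redis
    final List<String> writeBuffer = new ArrayList<>();  // stand-in for a Redis-backed buffer
    int dbReads = 0;     // number of simulated DB reads
    int dbBatches = 0;   // number of simulated batch inserts

    // Look-aside: (1) try the cache, (2) on a miss read the DB and populate the cache.
    String read(String key, Function<String, String> dbLoader) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        dbReads++;
        String loaded = dbLoader.apply(key);
        cache.put(key, loaded);
        return loaded;
    }

    // Write-back, step 1: every write lands in the buffer only.
    void write(String row) {
        writeBuffer.add(row);
    }

    // Write-back, step 2: a scheduled job drains the buffer into one batch insert.
    List<String> flush() {
        List<String> batch = new ArrayList<>(writeBuffer);
        writeBuffer.clear();
        if (!batch.isEmpty()) {
            dbBatches++;
        }
        return batch;
    }
}
```

Calling write a thousand times and flush once produces a single 1000-row batch, which is the "one big insert" side of the comparison. The price of write-back is that anything still in the buffer is lost if Redis goes down before the flush.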

Server-side — operating Redis

Speed comes from living in RAM, so the whole operations story is really about protecting that memory: bounding it, replicating it for failover, and splitting it across servers when one box runs out.

⊙ Memory management

Redis is fast precisely because it reads/writes memory — but if physical memory fills up and the OS starts swapping, every access turns into disk I/O and throughput collapses. So I treat memory as the primary operational signal:

  • Set maxmemory deliberately, based on the service. Past the limit, Redis evicts according to its eviction policy (below).
  • Keep each instance ≤ ½ of server RAM — this leaves headroom for the copy-on-write spike during replication (explained next).
  • Memory is managed in pages/frames, so storing wildly different value sizes causes fragmentation — actual footprint exceeds the stored data. Keeping similarly-sized values together helps.
  • Watch it continuously (top, plus the INFO metrics further down).
maxmemory-policy  | What it evicts when full
noeviction        | Nothing; returns an OOM error on writes
allkeys-lru       | Least-recently-used key, across all keys
volatile-lru      | LRU key, only among keys with a TTL
volatile-ttl      | Shortest-TTL key first, among keys with a TTL
allkeys-random    | A random key, across all keys
volatile-random   | A random key, among keys with a TTL
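
To make the difference between evicting and refusing concrete, here is a small sketch of the allkeys-lru and noeviction behaviours. It is an approximation under stated assumptions: capacity is counted in entries rather than bytes, the LRU order is exact (Redis itself samples keys and approximates LRU), and EvictionSketch and its methods are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: an entry count stands in for maxmemory, which Redis measures in bytes.
class EvictionSketch {
    // allkeys-lru flavour: an access-ordered map that drops the least-recently-used entry.
    static Map<String, String> lruStore(int maxEntries) {
        return new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // noeviction flavour: refuse a write that would add a new key to a full store.
    static boolean putNoEviction(Map<String, String> store, int maxEntries,
                                 String key, String value) {
        if (!store.containsKey(key) && store.size() >= maxEntries) {
            return false; // Redis would answer the write with an OOM error here
        }
        store.put(key, value);
        return true;
    }
}
```

With a two-entry LRU store, putting a and b, reading a, then putting c evicts b, because a was touched more recently.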

⊙ Replication — HA / disaster recovery

Redis is a process; it can die at any time. Replication keeps synchronized copies — an Origin (Master) plus Replicas (Slaves) forming a Replica Set. It uses asynchronous replication: the Origin fork()s a child process so it can keep serving while the child ships data to replicas.

[Figure: full synchronization. The Origin (Master) keeps serving traffic and forks a Child, which writes an RDB snapshot; the RDB is sent to each Replica (Slave), which loads it. New commands during the snapshot are queued in a replication buffer and replayed afterwards. Statement-level replication.]
Full synchronization — fork a child, snapshot to RDB, ship to replicas, then replay buffered commands
  • Full sync — used on first set-up or when replicas drift too far: fork → RDB snapshot → transfer → replicas load it; commands arriving meanwhile are buffered and replayed.
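  • Partial resync: for brief disconnects. Each side tracks a replication ID (the run_id of older versions) plus a replication offset, and the Origin replays only the missing range from its replication backlog (sized by repl-backlog-size). If the replica's offset has already fallen out of the backlog, or the replication ID no longer matches (for example after a restart that loses the replication state), it falls back to a full sync.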
  • The COW gotcha: fork() shares parent pages copy-on-write; writes during sync allocate fresh pages, so a 32 GB instance can momentarily need up to ~64 GB. This is why I prefer many small instances (8 GB × 8) over a few big ones (32 GB × 2), and cap each at ≤ ½ of host RAM.
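
The partial-versus-full decision and the copy-on-write sizing rule can both be written down as small functions. This is a sketch of the reasoning only, not Redis's actual implementation: the names (ReplicationSketch, decide, worstCaseBytesDuringFork) and the simplified backlog model are assumptions.

```java
// Illustrative only: models the decision described above, not Redis internals.
class ReplicationSketch {
    enum Sync { PARTIAL, FULL }

    // A reconnecting replica can resume only if it followed the same origin history
    // and its offset is still covered by the origin's backlog window.
    static Sync decide(String originId, String idKnownToReplica, long replicaOffset,
                       long backlogStartOffset, long backlogEndOffset) {
        boolean sameHistory = originId.equals(idKnownToReplica);
        boolean stillInBacklog = replicaOffset >= backlogStartOffset
                && replicaOffset <= backlogEndOffset;
        return (sameHistory && stillInBacklog) ? Sync.PARTIAL : Sync.FULL;
    }

    // Copy-on-write worst case: every page rewritten while the child is alive is duplicated.
    // fractionRewritten is 0.0 (no writes during the snapshot) to 1.0 (every page touched).
    static long worstCaseBytesDuringFork(long datasetBytes, double fractionRewritten) {
        return datasetBytes + Math.round(datasetBytes * fractionRewritten);
    }
}
```

With a 32 GiB dataset and fractionRewritten = 1.0 the second function returns 64 GiB, which is the spike the ½-of-host-RAM rule is meant to absorb.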

⊙ Sharding — scaling out

Replication is about availability, not capacity. One instance has finite RAM; when you can't scale up, you scale out by distributing data across servers — horizontal partitioning, called sharding in the NoSQL world. (Unlike DBMS partitioning, which divides within one server, sharding divides at the server boundary.) The real question is how you split the keys:

Strategy            | How                                          | Trade-off
Range               | Split by key range (e.g. dates)              | Simple, but uneven load → hot shards
Modular             | hash(key) % N servers                        | Even, but adding a node triggers heavy rebalancing (grow ×2 to halve it)
Index               | A central index server places/locates data   | Balanced, but the index is an extra failure point
Consistent hashing  | Hash both keys and servers onto a ring       | Only the affected slice rebalances on add/remove
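
The rebalancing cost in the Modular row can be checked with a few lines of arithmetic. The sketch below assumes the key hashes are simply the integers 0..keyCount-1, and ModularShardingSketch is a hypothetical name.

```java
// Illustrative only: counts how many keys change shard when the server count changes.
class ModularShardingSketch {
    // hash(key) % N, kept non-negative.
    static int shard(long keyHash, int servers) {
        return (int) Math.floorMod(keyHash, (long) servers);
    }

    // Number of keys in 0..keyCount-1 whose shard differs between the two layouts.
    static int movedKeys(int keyCount, int serversBefore, int serversAfter) {
        int moved = 0;
        for (long k = 0; k < keyCount; k++) {
            if (shard(k, serversBefore) != shard(k, serversAfter)) {
                moved++;
            }
        }
        return moved;
    }
}
```

For 10,000 keys, growing from 4 to 5 servers moves 8,000 of them, while doubling from 4 to 8 moves 5,000. That is the sense in which growing ×2 roughly halves the rebalancing.
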
[Figure: consistent hashing ring. A key maps to the next server clockwise. Servers: S1 (10000), S2 (20000), S3 (30000). Keys: 9400 → S1, 12000 → S2, 28500 → S3.]
Consistent hashing localizes rebalancing — if S2 leaves, only its keys move and the rest stay put
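
The ring lookup is short enough to sketch with a sorted map. This is a minimal version under stated assumptions: positions are passed in directly, using the numbers from the figure, whereas a real ring would hash keys and server names and usually add virtual nodes. HashRingSketch is a hypothetical name.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: ring positions are supplied by the caller instead of being hashed.
class HashRingSketch {
    private final TreeMap<Integer, String> ring = new TreeMap<>();

    void addServer(int position, String name) {
        ring.put(position, name);
    }

    void removeServer(int position) {
        ring.remove(position);
    }

    // A key belongs to the first server at or after its position ("next clockwise"),
    // wrapping around to the lowest position when it runs past the last server.
    String serverFor(int keyPosition) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Integer, String> owner = ring.ceilingEntry(keyPosition);
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }
}
```

With S1, S2 and S3 at 10000, 20000 and 30000, key 12000 resolves to S2. After removeServer(20000) it resolves to S3, while keys 9400 and 28500 keep their owners.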

⊙ Operation modes — Sentinel & Cluster

Rather than wiring replication and failover by hand, Redis ships two managed modes that package these mechanisms for operations.

[Figure: operation modes. Sentinel, HA (auto-failover): the Master fails (✗) and one of the Replicas is promoted (↑); 3 Sentinels watch and vote, and once the quorum agrees a replica is promoted; roles: monitoring · failover · alarm. Cluster, HA + Sharding: Primary #1 · slot 0–5460, Primary #2 · slot 5461–10922, Primary #3 · slot 10923–16383; crc16(key) % 16384 → slot → primary; each primary has a secondary · auto-failover.]
Sentinel adds automatic failover to replication; Cluster adds sharding on top via a 16,384-slot hash space
  • Sentinel (HA): Sentinel processes run alongside the Redis nodes, monitor them, and on master failure hold a quorum vote to promote a replica; they can also alert via Pub/Sub or shell scripts. Run ≥ 3 Sentinels so the vote can reach a majority. A recovered master rejoins as a replica.
  • Cluster (HA + sharding): slots 0–16383 are divided across primaries; an incoming key is hashed with crc16, then % 16384 picks its slot and therefore its primary. If a command reaches a node that does not own the key's slot, that node replies -MOVED and the client redirects to the owner. Secondaries auto-promote on failure (a different mechanism from Sentinel). Cost: the client needs slot-aware logic and the cluster carries slot-management overhead.
Mode         | High availability   | Sharding
Stand-alone  | –                   | –
Sentinel     | ✓ (auto-failover)   | –
Cluster      | ✓ (auto-failover)   | ✓
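
The slot calculation that Cluster clients have to reproduce can be sketched as follows. The CRC16 variant is the XMODEM one (polynomial 0x1021, initial value 0) that Redis Cluster specifies. The hash-tag handling is included because it is part of the same key-to-slot rule, although the text above does not discuss it. ClusterSlotSketch is a hypothetical name, and real clients such as Lettuce ship their own implementation.

```java
import java.nio.charset.StandardCharsets;

// Illustrative only: key -> slot mapping as described for Redis Cluster.
class ClusterSlotSketch {
    // CRC16-CCITT (XMODEM): polynomial 0x1021, initial value 0, no bit reflection.
    static int crc16(byte[] data) {
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? ((crc << 1) ^ 0x1021) : (crc << 1);
                crc &= 0xFFFF;
            }
        }
        return crc;
    }

    static int slotFor(String key) {
        // Hash tag: if the key contains a non-empty {...} section, only that part is hashed,
        // which lets related keys share a slot.
        int open = key.indexOf('{');
        if (open >= 0) {
            int close = key.indexOf('}', open + 1);
            if (close > open + 1) {
                key = key.substring(open + 1, close);
            }
        }
        return crc16(key.getBytes(StandardCharsets.UTF_8)) % 16384;
    }
}
```

The key "123456789" has CRC16 0x31C3, so it maps to slot 12739, which falls in the 10923–16383 range of the third primary in the figure above.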

⊙ Monitoring

Redis executes commands on a single thread, so for CPU you scale per-core speed, not core count. The headline signals from INFO:

    # Memory
    used_memory_human:916.98K        # what Redis uses
    used_memory_rss_human:6.63M      # RSS: physical memory, OS view
    maxmemory_human:0B
    maxmemory_policy:noeviction

    # CPU (single-thread → raise core speed, not count)
    used_cpu_sys:853.71
    used_cpu_user:923.68

    # Replication
    role:master
    connected_slaves:0
    master_repl_offset:0
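
For automated checks, the INFO text is easy to parse into key/value pairs. The sketch below assumes the text has already been fetched, and it reads the raw byte fields used_memory, used_memory_rss and maxmemory, which real INFO output reports next to the _human variants shown above. InfoSignalsSketch and its method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: parses INFO-style text that has already been fetched from Redis.
class InfoSignalsSketch {
    // Turn "key:value" lines into a map, skipping blank lines and "# Section" headers.
    static Map<String, String> parse(String infoText) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String raw : infoText.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon), line.substring(colon + 1).trim());
            }
        }
        return fields;
    }

    // RSS divided by what Redis accounts for; a ratio well above 1 hints at fragmentation.
    static double fragmentationRatio(Map<String, String> fields) {
        double used = Double.parseDouble(fields.get("used_memory"));
        double rss = Double.parseDouble(fields.get("used_memory_rss"));
        return rss / used;
    }

    // maxmemory:0 means no limit is configured.
    static boolean hasMemoryLimit(Map<String, String> fields) {
        return Long.parseLong(fields.getOrDefault("maxmemory", "0")) > 0;
    }
}
```

In the sample above maxmemory is 0, so hasMemoryLimit would be false. Together with the noeviction policy, that means nothing bounds the instance's memory.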

Client-side — Spring Boot integration

On the application side Redis becomes a client interface problem. With spring-boot-starter-data-redis the connection auto-configures (Lettuce by default; Redisson is the other common client), and a small YAML block wires host / port / password and the Lettuce pool. From there I used three layers — caching, session, distributed lock.

    // Gradle dependencies
    dependencies {
        implementation 'org.springframework.boot:spring-boot-starter-data-redis'
        implementation 'org.springframework.session:spring-session-data-redis'
        implementation 'org.springframework.boot:spring-boot-starter-cache'
    }

    # Application YAML (spring.redis.* is the Spring Boot 2.x prefix; Boot 3.x uses spring.data.redis.*)
    spring:
      redis:
        host: { redis-server ip }
        port: { redis-server port }
        password: { redis-server password }
        lettuce:
          # Lettuce pooling needs commons-pool2 on the classpath
          pool: { max-active: 5, max-idle: 5, min-idle: 2 }

⊙ Caching — why, and three ways

If each WAS caches in its own JVM heap, the instances have to sync their caches with each other. Put the cache in Redis and that disappears — every instance just looks at one place.

[Figure: Before, a local cache per WAS: WAS 1, WAS 2 and WAS 3 must sync caches between every node (✗). After, Redis as a shared cache: WAS 1, WAS 2 and WAS 3 all read one Redis (1 source), so no inter-node sync is needed.]
Moving the cache out of per-WAS heaps into Redis removes the cross-instance sync cost

1 · Spring Data Repository — a CrudRepository over a @RedisHash domain object; objects map to Redis hashes, @Id/@Indexed become lookup keys, and @TimeToLive sets expiry. Feels just like JPA.

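    import java.io.Serializable;
    import java.util.Optional;
    import org.springframework.data.annotation.Id;
    import org.springframework.data.redis.core.RedisHash;
    import org.springframework.data.redis.core.index.Indexed;
    import org.springframework.data.repository.CrudRepository;

    // Stored as a Redis hash in the "people" keyspace; entries expire after 1800 seconds
    @RedisHash(value = "people", timeToLive = 1800)
    public class Person implements Serializable {
        @Id private Long id;
        @Indexed private String name;   // secondary index, enables lookups by name
        private Integer age;
    }

    public interface PersonRedisRepository extends CrudRepository<Person, Long> {
        Optional<Person> findPersonByName(String name);
        // save / findById / deleteById come for free
    }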

2 · Caching annotations — wrap a service method in a caching proxy: @Cacheable on reads, @CachePut on updates, @CacheEvict on deletes. You register a CacheManager with a serializer and a default TTL (else redis-cli shows opaque serialized bytes). Note: an updating method must return the value so the proxy can cache it.

    @Cacheable(key = "#name", value = "personCache")
    public Person findPersonByName(String name) {
        return repo.getByName(name);
    }

    @CachePut(key = "#person.name", value = "personCache")
    public Person update(Person person) {
        return repo.updateByName(person);   // must return!
    }

    @CacheEvict(key = "#name", value = "personCache")
    public void delete(String name) {
        repo.deleteByName(name);
    }
    // → key in redis: "personCache::kim"

3 · RedisTemplate — the low-level operator when you want a specific data structure directly: opsForValue / opsForList / opsForSet / opsForZSet / opsForHash, plus expire, keys, rename, delete. The catch I flagged for the team: even with List/Hash/Set you still address everything by the Redis key — don't confuse a hash's inner field with the Redis key — and RedisCommandExecutionException: ERR no such key on rename means you need solid error handling.

⊙ Session — Spring Session

Same story as caching: WAS-local sessions force sticky sessions or session sync across the cluster. Backing Spring Session with Redis centralizes it. Config goes in YAML — and one trap I documented: adding @EnableRedisHttpSession makes the annotation's attributes win and your YAML session settings stop taking effect.

    server:
      servlet:
        session:
          timeout: 60                        # seconds
    spring:
      session:
        store-type: redis
        redis:
          namespace: spring:session:admin    # key prefix
          flush-mode: on_save                # on_save vs immediate
          cleanup-cron: 0/5 * * * * *        # sweep expired remnants

    @Bean   // store sessions as JSON to avoid unreadable serialized bytes in redis
    RedisSerializer<Object> springSessionDefaultRedisSerializer() {
        return new GenericJackson2JsonRedisSerializer();
    }

flush-mode is the subtle one: on_save writes the session to Redis only when the session is saved, which by default happens as the response is being committed, while immediate writes as soon as the session changes (for example on setAttribute). It is a behavioral difference you only catch with a deliberate Thread.sleep test before the response returns.

⊙ Distributed lock

To stop two processes/threads from touching a shared resource at once, I implemented a spin lock on Redis with setIfAbsent (atomic "set if not exists") plus a TTL so a crashed holder can't deadlock the resource; unlock is just a delete.

    public Boolean lock(Long key) {
        return redisTemplate.opsForValue()
                .setIfAbsent(String.valueOf(key), "lock", Duration.ofMillis(10000L));
    }

    public Boolean unlock(Long key) {
        return redisTemplate.delete(String.valueOf(key));
    }

    // usage: spin until acquired, always release in finally
    while (!redisLockRepository.lock(id)) {
        Thread.sleep(300);   // sleep cushions Redis load
    }
    try {
        redisCRUDService.create(p1);
    } finally {
        redisLockRepository.unlock(id);
    }

The spin lock's weakness is the constant re-polling load on Redis; the Thread.sleep(300) is the deliberate buffer that softens it.
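
One more edge case follows from pairing a TTL with a plain delete: if the holder runs longer than the TTL, the lock expires, another caller acquires it, and the first holder's unlock then deletes a lock it no longer owns. A common variation, shown here as an assumption rather than what the manual implemented, stores a per-holder token as the value and releases only on a match. The sketch uses a synchronized in-memory map in place of Redis so that it runs on its own. Against a real Redis the compare-and-delete would have to be atomic, typically via a Lua script. TokenLockSketch and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: an in-memory map stands in for Redis; time is passed in for testability.
class TokenLockSketch {
    private static final class Entry {
        final String owner;
        final long expiresAtMillis;

        Entry(String owner, long expiresAtMillis) {
            this.owner = owner;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> store = new HashMap<>();

    // Mirrors setIfAbsent(key, value, ttl): succeeds only if the key is absent or expired.
    synchronized boolean lock(String key, String ownerToken, long ttlMillis, long nowMillis) {
        Entry current = store.get(key);
        if (current != null && current.expiresAtMillis > nowMillis) {
            return false;
        }
        store.put(key, new Entry(ownerToken, nowMillis + ttlMillis));
        return true;
    }

    // Releases only if the caller still owns a live lock, so a holder whose TTL ran out
    // cannot delete a lock that someone else has acquired in the meantime.
    synchronized boolean unlock(String key, String ownerToken, long nowMillis) {
        Entry current = store.get(key);
        if (current == null || current.expiresAtMillis <= nowMillis
                || !current.owner.equals(ownerToken)) {
            return false;
        }
        store.remove(key);
        return true;
    }
}
```

If holder A's 10-second lock expires and holder B acquires the same key, A's late unlock returns false and B's lock stays in place.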


What I took away

  • Redis is a memory-operations problem first. Most production risk traces back to memory — eviction policy, the copy-on-write spike on fork, and fragmentation — so I size instances to ≤ ½ host RAM and prefer many small ones.
  • Pick the mode for the need: Sentinel when you only want HA, Cluster when you also need to shard — and remember Cluster pushes slot-awareness onto the client.
  • On the app side, the right abstraction matters: repository / annotations for caching, Spring Session for session, a TTL-guarded setIfAbsent for locking — each with its own serializer and its own subtle config trap.