Centralized Logging with ELK Stack (Elasticsearch, Logstash, Kibana)
Modern distributed microservices architectures generate enormous volumes of logs every second. In enterprise-scale systems, logs are produced by API gateways, authentication services, payment systems, Kafka consumers, Kubernetes clusters, databases, monitoring agents, security systems, and cloud infrastructure components.
Without centralized logging, troubleshooting production issues becomes extremely difficult. Engineers waste time logging into multiple servers, manually searching log files, correlating timestamps, and identifying failures across distributed systems.
The ELK Stack (Elasticsearch, Logstash, and Kibana) solves this problem by providing a centralized platform for collecting, processing, storing, searching, analyzing, and visualizing logs from distributed applications.
In modern Spring Boot microservices environments, centralized logging is not optional. It is a critical operational requirement for debugging, monitoring, security auditing, observability, incident response, compliance, and production support.
This guide provides a production-grade deep dive into implementing centralized logging with the ELK Stack for Spring Boot microservices architectures. You will learn architecture design, internal workflows, Spring Boot integration, structured JSON logging, Docker deployment, Logstash pipelines, Elasticsearch indexing, Kibana dashboards, distributed tracing correlation, performance optimization, security hardening, operational best practices, and enterprise troubleshooting strategies.
Table of Contents
- What You Will Learn
- What is Centralized Logging
- Why Centralized Logging is Important
- Introduction to the ELK Stack
- Understanding Elasticsearch
- Understanding Logstash
- Understanding Kibana
- ELK Stack Architecture Overview
- Logging Challenges in Microservices
- Setting Up ELK Stack with Docker
- Creating a Spring Boot Logging Project
- Structured JSON Logging
- Configuring Logback for JSON Logs
- Sending Logs to Logstash
- Logstash Pipeline Configuration
- Understanding Elasticsearch Indexing
- Visualizing Logs with Kibana
- Distributed Tracing and Log Correlation
- Monitoring and Alerting
- Security Best Practices
- Performance Optimization
- Common Production Problems
- Real World Enterprise Architecture
- Troubleshooting Logging Issues
- Interview Questions and Answers
- Frequently Asked Questions
- Summary
- Next Learning Recommendations
What You Will Learn
- Centralized logging fundamentals
- ELK Stack architecture
- Elasticsearch internals
- Logstash pipeline configuration
- Kibana visualization techniques
- Spring Boot structured logging
- JSON log formatting
- Distributed log correlation
- Observability best practices
- Production monitoring strategies
- Security hardening techniques
- Performance optimization methods
- Enterprise troubleshooting workflows
- Operational best practices
What is Centralized Logging
Centralized logging is the process of collecting logs from multiple applications, servers, containers, and infrastructure components into a single searchable platform.
Simple Definition
Centralized logging aggregates logs from distributed systems into one location for monitoring, debugging, and analysis.
Without Centralized Logging
Server A Logs
Server B Logs
Server C Logs
Container Logs
Database Logs
Kubernetes Logs
|
v
Manual Searching Everywhere
With Centralized Logging
All Services
|
v
Centralized ELK Stack
|
v
Single Searchable Dashboard
Why Centralized Logging is Important
Modern microservices environments may contain hundreds of distributed services. Each service generates logs independently.
Major Problems Without Centralized Logging
- Difficult debugging
- No distributed visibility
- Slow incident response
- Manual log correlation
- Poor operational monitoring
- Security audit challenges
- Compliance difficulties
Benefits of Centralized Logging
- Unified observability
- Faster troubleshooting
- Real-time monitoring
- Security auditing
- Distributed tracing support
- Production analytics
- Operational intelligence
Introduction to the ELK Stack
The ELK Stack consists of:
- Elasticsearch
- Logstash
- Kibana
ELK Workflow
Applications
|
v
Logstash
|
v
Elasticsearch
|
v
Kibana Dashboard
Component Responsibilities
| Component | Purpose |
|---|---|
| Elasticsearch | Search and storage engine |
| Logstash | Log ingestion and processing |
| Kibana | Visualization and dashboards |
Understanding Elasticsearch
Elasticsearch is a distributed search and analytics engine built on Apache Lucene.
Core Features
- Distributed indexing
- Real-time search
- Horizontal scalability
- Full-text search
- High availability
- Aggregation support
Elasticsearch Architecture
            Elasticsearch Cluster
                     |
          +----------+----------+
          |          |          |
          v          v          v
        Node1      Node2      Node3
Important Concepts
- Index
- Document
- Shard
- Replica
- Cluster
- Node
Understanding Logstash
Logstash is a data processing pipeline used to collect, transform, enrich, and forward logs.
Logstash Pipeline Stages
Input
|
v
Filter
|
v
Output
Responsibilities
- Read logs
- Parse messages
- Transform data
- Enrich metadata
- Forward to Elasticsearch
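The responsibilities above can be pictured as three small functions chained together. The following is a conceptual Java sketch only, not how Logstash is implemented: the class and method names (`PipelineSketch`, `parse`, `enrich`, `process`) and the `key=value` input format are hypothetical, and an in-memory list stands in for Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Conceptual model of input -> filter -> output. All names are hypothetical.
final class PipelineSketch {

    // Input stage: parse a raw "key=value key=value" line into an event map.
    static Map<String, String> parse(String rawLine) {
        Map<String, String> event = new LinkedHashMap<>();
        for (String token : rawLine.trim().split("\\s+")) {
            int eq = token.indexOf('=');
            if (eq > 0) { // skip tokens that are not key=value pairs
                event.put(token.substring(0, eq), token.substring(eq + 1));
            }
        }
        return event;
    }

    // Filter stage: enrich a copy of the event with extra metadata.
    static Map<String, String> enrich(Map<String, String> event, String environment) {
        Map<String, String> enriched = new LinkedHashMap<>(event);
        enriched.put("environment", environment);
        return enriched;
    }

    // Output stage: forward each processed event to a sink.
    // The sink is an in-memory list standing in for Elasticsearch.
    static void process(List<String> rawLines, String environment,
                        List<Map<String, String>> sink) {
        for (String line : rawLines) {
            sink.add(enrich(parse(line), environment));
        }
    }
}
```

Each stage only depends on the output of the previous one, which is why Logstash inputs, filters, and outputs can be combined freely.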
Understanding Kibana
Kibana is the visualization layer of the ELK Stack.
Capabilities
- Log search
- Dashboards
- Visual analytics
- Error monitoring
- Operational insights
- Security analytics
Kibana Dashboard Example
+----------------------+
| Error Rate Dashboard |
+----------------------+
+----------------------+
| API Response Trends  |
+----------------------+
+----------------------+
| Security Alerts      |
+----------------------+
ELK Stack Architecture Overview
Spring Boot Services
|
v
JSON Logs
|
v
Logstash
|
v
Elasticsearch Cluster
|
v
Kibana Dashboards
Enterprise Flow
- Application generates structured logs
- Logs are forwarded to Logstash
- Logstash parses and enriches logs
- Elasticsearch indexes logs
- Kibana visualizes searchable logs
Logging Challenges in Microservices
Distributed Requests
A single user request may travel through multiple services.
Containerized Environments
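Containers are ephemeral, so logs kept only inside a container are lost when the container is removed unless they are shipped elsewhere first.
Massive Log Volume
Large enterprise systems can generate terabytes of logs per day.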
Correlation Complexity
Identifying related logs across services is difficult.
Setting Up ELK Stack with Docker
docker-compose.yml
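version: '3'
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.12.0
    environment:
      discovery.type: single-node
      # Local development only: 8.x enables security by default, which would
      # otherwise require TLS and credentials in Logstash and Kibana.
      xpack.security.enabled: "false"
    ports:
      - "9200:9200"
  logstash:
    image: docker.elastic.co/logstash/logstash:8.12.0
    volumes:
      # Mount the pipeline file described under Logstash Pipeline Configuration
      - ./logstash.conf:/usr/share/logstash/pipeline/logstash.conf:ro
    ports:
      - "5000:5000"
    depends_on:
      - elasticsearch
  kibana:
    image: docker.elastic.co/kibana/kibana:8.12.0
    environment:
      ELASTICSEARCH_HOSTS: http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch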
Start ELK Stack
docker-compose up -d
Creating a Spring Boot Logging Project
Required Dependencies
- Spring Boot Starter Web
- Spring Boot Actuator
- Logback Encoder
Maven Dependency
<dependency>
  <groupId>net.logstash.logback</groupId>
  <artifactId>logstash-logback-encoder</artifactId>
  <version>7.4</version>
</dependency>
Structured JSON Logging
Structured logging stores logs in machine-readable JSON format.
Why JSON Logging Matters
- Easy parsing
- Searchable fields
- Efficient indexing
- Better analytics
- Distributed correlation
Traditional Log Example
2026-05-29 INFO Order Created
Structured JSON Log Example
{
  "timestamp": "2026-05-29",
  "level": "INFO",
  "service": "order-service",
  "traceId": "abc123",
  "message": "Order Created"
}
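To make the structure concrete, here is a minimal Java sketch that renders a set of fields as a single-line JSON log entry using only the JDK. It is an illustration of the output shape, not the encoder used later in this guide; in a Spring Boot service the Logstash Logback encoder produces the JSON for you. The `JsonLogLine` class is hypothetical, and the sketch assumes all values are non-null strings.

```java
import java.util.Map;

// Hypothetical helper: renders string fields as one JSON object on one line.
final class JsonLogLine {

    // Escape characters that are not allowed unescaped inside a JSON string.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Fields are written in the map's iteration order; values must be non-null.
    static String format(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : fields.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }
}
```

Keeping each event on one line matters because line-oriented shippers and parsers treat a newline as the end of a record, which is why embedded newlines are escaped.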
Configuring Logback for JSON Logs
logback-spring.xml
<configuration>
  <!-- Ships each log event to Logstash over TCP as one JSON document per line -->
  <appender name="LOGSTASH" class="net.logstash.logback.appender.LogstashTcpSocketAppender">
    <!-- Must match the host and port of the Logstash tcp input -->
    <destination>localhost:5000</destination>
    <encoder class="net.logstash.logback.encoder.LogstashEncoder"/>
  </appender>
  <root level="INFO">
    <appender-ref ref="LOGSTASH"/>
  </root>
</configuration>
Benefits
- Automatic JSON formatting
- Centralized log shipping
- Structured metadata
- Search optimization
Sending Logs to Logstash
Applications send logs directly to Logstash over TCP, one JSON document per line.
Log Flow
Spring Boot App
|
v
TCP Log Stream
|
v
Logstash
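The wire format in this flow is simple enough to reproduce by hand, which can be useful for testing a Logstash TCP input without starting an application. The following Java sketch opens a socket, writes one JSON document followed by a newline, and closes the connection. The `TcpLogSender` class is hypothetical; in a real service the Logback appender does this work and also handles reconnection and asynchronous buffering, which this sketch does not.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical test helper: sends a single newline-terminated JSON event over TCP.
final class TcpLogSender {

    static void send(String host, int port, String jsonLine) throws IOException {
        // try-with-resources closes the stream and the socket after the write.
        try (Socket socket = new Socket(host, port);
             OutputStream out = socket.getOutputStream()) {
            // The trailing newline marks the end of the event for line-based codecs.
            out.write((jsonLine + "\n").getBytes(StandardCharsets.UTF_8));
            out.flush();
        }
    }
}
```

Opening a new connection per event is acceptable for a manual test but far too slow for production traffic, where a long-lived connection is reused.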
Logstash Pipeline Configuration
logstash.conf
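input {
  tcp {
    port => 5000
    # LogstashTcpSocketAppender sends newline-delimited JSON, so use json_lines
    codec => json_lines
  }
}
filter {
  mutate {
    # Tag every event with its environment; adjust per deployment
    add_field => {
      "environment" => "production"
    }
  }
}
output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    # One index per day, for example spring-logs-2026.05.29
    index => "spring-logs-%{+YYYY.MM.dd}"
  }
}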
Pipeline Explanation
- Input receives logs
- Filter enriches data
- Output sends logs to Elasticsearch
Understanding Elasticsearch Indexing
Elasticsearch stores logs inside indexes.
Index Example
spring-logs-2026.05.29
Why Daily Indexes Matter
- Efficient retention
- Better performance
- Simpler archival
- Easier cleanup
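The naming scheme and the retention benefit can be illustrated with a short Java sketch. Logstash derives the date suffix from the event timestamp in UTC, so the sketch does the same. The `DailyIndex` class and its methods are hypothetical, and in practice retention is usually delegated to Elasticsearch index lifecycle policies rather than custom code.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper illustrating daily index names and age-based cleanup.
final class DailyIndex {

    private static final DateTimeFormatter SUFFIX = DateTimeFormatter.ofPattern("uuuu.MM.dd");

    // Name of the index for the UTC day on which the event occurred.
    static String nameFor(String prefix, Instant eventTime) {
        LocalDate day = eventTime.atZone(ZoneOffset.UTC).toLocalDate();
        return prefix + "-" + SUFFIX.format(day);
    }

    // True when the index's day is more than retentionDays before "today".
    // Assumes indexName has the form prefix-yyyy.MM.dd.
    static boolean isExpired(String indexName, String prefix, LocalDate today, int retentionDays) {
        String suffix = indexName.substring(prefix.length() + 1);
        LocalDate day = LocalDate.parse(suffix, SUFFIX);
        return day.isBefore(today.minusDays(retentionDays));
    }
}
```

Because each day lives in its own index, expiring old data means dropping whole indexes, which is much cheaper than deleting individual documents from one large index.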
Index Architecture
             Elasticsearch Index
                     |
          +----------+----------+
          |          |          |
          v          v          v
       Shard1     Shard2     Shard3
Visualizing Logs with Kibana
Kibana Features
- Search logs
- Create dashboards
- Visualize trends
- Monitor errors
- Track response times
Useful Dashboards
- Error rate dashboard
- API latency dashboard
- Authentication failure dashboard
- Kafka consumer lag dashboard
- Database error dashboard
Distributed Tracing and Log Correlation
Distributed tracing allows correlating logs across multiple services.
Trace Flow
Client Request
|
v
Gateway Service
|
v
Order Service
|
v
Payment Service
Using Trace IDs
Each request receives a unique trace ID.
All services include the trace ID in logs.
Benefits
- End-to-end request tracking
- Performance analysis
- Distributed debugging
- Root cause analysis
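The mechanics of carrying a trace ID through a request can be sketched in plain Java. This is a simplified illustration under stated assumptions, not the way Spring Boot does it: a tracing library normally manages the ID and exposes it to the logging framework. The `TraceContext` class, its methods, and the `key=value` output format are all hypothetical.

```java
import java.util.UUID;

// Hypothetical per-thread holder for the current request's trace ID.
final class TraceContext {

    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    // Reuse the ID sent by the upstream service, or start a new trace if none was sent.
    static String begin(String incomingTraceId) {
        boolean missing = incomingTraceId == null || incomingTraceId.trim().isEmpty();
        String traceId = missing
                ? UUID.randomUUID().toString().replace("-", "")
                : incomingTraceId;
        CURRENT.set(traceId);
        return traceId;
    }

    // Trace ID of the request handled by the current thread, or null if none.
    static String current() {
        return CURRENT.get();
    }

    // Clear the ID when the request finishes so pooled threads do not leak it.
    static void end() {
        CURRENT.remove();
    }

    // Every log line carries the trace ID so it can be searched across services.
    static String logLine(String service, String message) {
        return "traceId=" + current() + " service=" + service + " message=" + message;
    }
}
```

A service would call `begin` at the start of request handling with whatever ID arrived in the request headers, pass the same ID on every outgoing call, and call `end` in a `finally` block. Searching Kibana for that one ID then returns the log lines from every service the request touched.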
Monitoring and Alerting
Critical Alerts
- High error rates
- Authentication failures
- Service crashes
- Kafka consumer failures
- Database connection issues
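As an illustration of the first alert in the list, the following Java sketch shows the kind of threshold check an error-rate alert performs over a time window. In an ELK deployment this logic would normally live in an alerting rule that queries Elasticsearch, not in application code. The `ErrorRateAlert` class, the threshold, and the minimum sample size are hypothetical values chosen for the example.

```java
// Hypothetical sketch of the decision behind a "high error rate" alert.
final class ErrorRateAlert {

    // Fraction of log events in the window that were errors; 0 when the window is empty.
    static double errorRate(long errorCount, long totalCount) {
        return totalCount == 0 ? 0.0 : (double) errorCount / totalCount;
    }

    // Alert only when the window has enough events to be meaningful
    // and the error rate is above the threshold.
    static boolean shouldAlert(long errorCount, long totalCount,
                               double threshold, long minSamples) {
        return totalCount >= minSamples && errorRate(errorCount, totalCount) > threshold;
    }
}
```

The minimum sample size guards against noisy alerts during quiet periods, when a single failure out of two requests would otherwise read as a 50% error rate.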
Monitoring Architecture
Applications
|
v
ELK Stack
|
v
Alerts & Dashboards
Enterprise Recommendation
Combine ELK with Prometheus and Grafana for full observability.
Security Best Practices
Protect Sensitive Information
- Never log passwords
- Mask credit card numbers
- Redact tokens
- Hide personal information
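One way to apply these rules is to sanitize messages before they reach the log appender. The Java sketch below masks card-like numbers, bearer tokens, and `password=` values with regular expressions. It is a minimal illustration with hypothetical names and patterns: the card pattern has no checksum validation, so it also masks other long digit runs, and real systems usually need broader, well-tested rules.

```java
import java.util.regex.Pattern;

// Hypothetical sanitizer applied to log messages before they are written.
final class LogSanitizer {

    // 13 to 19 digits, optionally separated by spaces or hyphens; captures the last four.
    private static final Pattern CARD =
            Pattern.compile("\\b(?:\\d[ -]?){9,15}(\\d{4})\\b");

    // "Bearer <token>" in any letter case; captures the prefix so it can be kept.
    private static final Pattern BEARER =
            Pattern.compile("(?i)(bearer\\s+)[A-Za-z0-9._~+/=-]+");

    // "password=<value>" or "password: <value>"; captures the key and separator.
    private static final Pattern PASSWORD =
            Pattern.compile("(?i)(password\\s*[=:]\\s*)\\S+");

    static String sanitize(String message) {
        // Keep only the last four card digits, a common convention for support lookups.
        String out = CARD.matcher(message).replaceAll("****-****-****-$1");
        out = BEARER.matcher(out).replaceAll("$1[REDACTED]");
        out = PASSWORD.matcher(out).replaceAll("$1[REDACTED]");
        return out;
    }
}
```

Masking at the source is preferable to masking in Logstash, because the sensitive value then never leaves the application process.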
Enable Authentication
Secure Kibana and Elasticsearch access.
Use TLS Encryption
Encrypt log traffic between components.
Implement Role-Based Access
Restrict operational log access.
Performance Optimization
Reduce Excessive Logging
Too many logs increase storage and processing costs.
Use Appropriate Log Levels
| Level | Purpose |
|---|---|
| DEBUG | Development troubleshooting |
| INFO | Business events |
| WARN | Potential problems |
| ERROR | Failures |
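The levels in the table are ordered, and a logger emits a message only when its level is at or above the configured threshold. The Java sketch below models that rule with a hypothetical `LogLevel` enum; it mirrors the behavior of common logging frameworks but is not taken from any of them.

```java
// Hypothetical model of level-based filtering, ordered from least to most severe.
enum LogLevel {
    DEBUG, INFO, WARN, ERROR;

    // A message is emitted when its level is at or above the configured threshold.
    boolean isEnabledAt(LogLevel threshold) {
        return this.ordinal() >= threshold.ordinal();
    }
}
```

With a threshold of INFO, as in the root logger configured earlier in this guide, DEBUG messages are dropped before they are formatted or shipped, which is what keeps log volume and cost under control in production.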
Optimize Elasticsearch
- Use index lifecycle policies
- Configure shard counts carefully
- Use SSD storage
- Archive old logs
Common Production Problems
Huge Log Volumes
Excessive logging overwhelms Elasticsearch clusters.
Incorrect JSON Parsing
Malformed logs break ingestion pipelines.
Storage Exhaustion
Large indexes consume massive disk space.
Slow Elasticsearch Queries
Poorly designed mappings and shard layouts reduce search performance.
Unbounded DEBUG Logging
Verbose logging impacts application performance.
Real World Enterprise Architecture
Enterprise Microservices Logging Architecture
API Gateway
Order Service
Payment Service
Inventory Service
Notification Service
Kafka Brokers
Kubernetes Nodes
Database Servers
|
v
Centralized Log Pipeline
|
v
Logstash Cluster
|
v
Elasticsearch Cluster
|
v
Kibana Dashboards
Enterprise Features
- Distributed tracing
- Structured JSON logs
- Security auditing
- Alerting systems
- High availability clusters
- Disaster recovery
- Log archival policies
- Compliance reporting
Troubleshooting Logging Issues
Logs Not Appearing in Kibana
- Verify Logstash is running
- Check Elasticsearch connectivity
- Inspect Logstash pipelines
- Verify index creation
Malformed JSON Errors
- Validate JSON format
- Check special characters
- Review encoder configuration
Elasticsearch Memory Problems
- Increase JVM heap size (Elastic recommends no more than 50% of the node's available memory)
- Reduce shard counts
- Archive old indexes
Slow Searches
- Optimize indexes
- Reduce wildcard queries
- Use proper mappings
Interview Questions and Answers
What is centralized logging?
Centralized logging aggregates logs from multiple systems into a unified searchable platform.
What is the ELK Stack?
The ELK Stack consists of Elasticsearch, Logstash, and Kibana.
Why use structured JSON logging?
JSON logs are machine-readable, searchable, and easier to analyze.
What is Elasticsearch?
Elasticsearch is a distributed search and analytics engine.
What is Logstash?
Logstash is a log ingestion and transformation pipeline.
Why is Kibana important?
Kibana provides dashboards, search capabilities, and log visualization.
Frequently Asked Questions
Can ELK Stack handle terabytes of logs?
Yes, provided the cluster is sized for it. Elasticsearch scales horizontally by adding nodes and distributing shards across them.
Why should logs be stored in JSON format?
JSON enables efficient parsing, indexing, and searching.
Can ELK integrate with Kubernetes?
Yes. ELK is widely used in Kubernetes environments.
What is a trace ID?
A trace ID uniquely identifies a distributed request across services.
Should DEBUG logs be enabled in production?
Only temporarily for troubleshooting because DEBUG logs increase overhead.
Can ELK be used for security monitoring?
Yes. ELK is commonly used for audit logging and security analytics.
Summary
Centralized logging is a critical operational capability for modern distributed microservices systems.
The ELK Stack provides a powerful platform for collecting, processing, storing, searching, and visualizing logs from distributed applications.
In this guide, you learned:
- Centralized logging fundamentals
- ELK Stack architecture
- Elasticsearch internals
- Logstash pipeline configuration
- Kibana dashboards and visualization
- Structured JSON logging
- Distributed tracing correlation
- Security best practices
- Performance optimization
- Production troubleshooting strategies
Mastering centralized logging is essential for backend engineers, DevOps engineers, SRE teams, cloud architects, and enterprise platform engineers building production-grade microservices systems.