
Docker Container Log Analysis

Covers systematic techniques for tracking and analyzing the causes of intermittent errors occurring in Docker containers in a production environment.

docker log analysis, docker logs filtering, container error tracking, docker compose logs, Docker log driver, JSON log parsing, container debugging, docker logs since

Problem

Intermittent 500 errors are occurring in Docker containers running in production. They happen only during certain time periods and are hard to reproduce, and because multiple microservices are connected via docker compose, it is difficult to identify which container the problem originates from. Container logs accumulate so rapidly that finding the cause manually is impractical, and JSON-formatted logs are mixed with plain-text logs. You need a systematic log analysis approach to find the root cause of the errors.

Required Tools

Docker CLI

Container management commands including docker logs, docker inspect, etc.

grep / awk

Text-based log filtering and pattern matching tools

jq

Command-line JSON processor for parsing and filtering JSON-formatted logs

docker compose

Multi-container environment management and unified log viewing

Solution Steps

1. Basic docker logs Usage and Container Status Check

The first step in troubleshooting is checking the container status and quickly reviewing recent logs. Understanding the basic options of the docker logs command lets you identify the cause in most situations. If a container has restarted, also check the logs from before the restart: docker logs has no --previous flag (that belongs to kubectl), but with the default json-file driver the log file persists across restarts of the same container, so the earlier output is still available.

# List running containers and their status
docker ps --format "table {{.Names}}\t{{.Status}}\t{{.Ports}}"

# Check detailed status of a specific container (restart count, etc.)
docker inspect --format='{{.RestartCount}} restarts, State: {{.State.Status}}' my-api-server

# View the last 100 lines of logs
docker logs --tail 100 my-api-server

# Stream logs in real-time (Ctrl+C to stop)
docker logs -f my-api-server

# View logs with timestamps
docker logs -t --tail 50 my-api-server

# Logs from before a restart are retained (json-file driver, same container);
# use the restart time from docker inspect to isolate them
docker logs --until "$(docker inspect --format='{{.State.StartedAt}}' my-api-server)" my-api-server

# Separate stdout and stderr
docker logs my-api-server 2>/dev/null    # stdout only
docker logs my-api-server 2>&1 1>/dev/null  # stderr only
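
The stream-separation idiom above can be tried without a running container. This sketch uses a stand-in function (emit is a name invented here) that writes to both streams, the same way docker logs replays a container's stdout and stderr:

```shell
# Stand-in for a command that writes to both streams,
# mimicking how docker logs replays container stdout/stderr
emit() { echo "to stdout"; echo "to stderr" >&2; }

emit 2>/dev/null          # keeps stdout only
emit 2>&1 1>/dev/null     # keeps stderr only: fd2 is pointed at the
                          # current fd1 first, then fd1 is discarded
```

The order of the redirections matters: writing `1>/dev/null 2>&1` instead would silence both streams.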

2. Narrow Down the Scope with Time-Based Filtering

If you know when the errors occurred, use the --since and --until options to extract only the logs from that time period. Supported time formats include RFC 3339 (2024-01-15T09:00:00), Unix timestamps, and relative time (10m, 2h). Extracting only a specific time range from a large volume of logs significantly reduces the analysis scope.

# Logs from the last 30 minutes
docker logs --since 30m my-api-server

# Logs from the last 2 hours
docker logs --since 2h my-api-server

# Logs within a specific time range (error occurrence window)
docker logs --since "2024-01-15T09:00:00" --until "2024-01-15T09:30:00" my-api-server

# Query by Unix timestamp
docker logs --since 1705280400 --until 1705282200 my-api-server

# Time filter + extract errors only (most commonly used combination)
docker logs --since 1h my-api-server 2>&1 | grep -i "error\|exception\|fatal\|panic"

# Analyze error frequency (per minute)
docker logs --since 6h -t my-api-server 2>&1 | grep -i error | awk '{print substr($1,1,16)}' | sort | uniq -c | sort -rn

# Check context around errors (-A: after, -B: before)
docker logs --since 1h my-api-server 2>&1 | grep -B 5 -A 10 "500 Internal Server Error"
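
When building the Unix-timestamp form of a query, the epoch values can be derived from the wall-clock window rather than computed by hand. A small sketch, assuming GNU date, for the 09:00–09:30 UTC window:

```shell
# Convert a wall-clock error window (UTC) to Unix timestamps
# suitable for --since/--until (GNU date assumed)
SINCE_TS=$(date -u -d "2024-01-15T09:00:00Z" +%s)
UNTIL_TS=$(date -u -d "2024-01-15T09:30:00Z" +%s)
echo "--since $SINCE_TS --until $UNTIL_TS"
```

The resulting values can be pasted directly into the docker logs command shown above.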

3. Analyze Error Patterns and Statistics with grep/awk

Extracting error patterns from logs and generating statistics helps you prioritize the most frequent issues. Use grep to filter patterns and the awk/sort/uniq combination to generate statistics. Tracking multiple error types simultaneously also helps identify correlations.

# Error count by error type
docker logs --since 24h my-api-server 2>&1 | \
  grep -oP '(Error|Exception|Warning): [^\n]+' | \
  sort | uniq -c | sort -rn | head -20

# Statistics by HTTP status code (Nginx/Express logs)
docker logs --since 24h my-nginx 2>&1 | \
  awk '{print $9}' | \
  sort | uniq -c | sort -rn

# Time distribution of a specific error message
docker logs --since 24h -t my-api-server 2>&1 | \
  grep "ECONNREFUSED" | \
  awk '{print substr($1,12,5)}' | \
  sort | uniq -c

# Extract slow requests (response time > 5000ms)
docker logs --since 6h my-api-server 2>&1 | \
  grep -P 'response_time[=:]\s*\d+' | \
  awk -F'response_time[=:]' '{if($2+0 > 5000) print $0}'

# Request count by IP (DDoS/load analysis)
docker logs --since 1h my-nginx 2>&1 | \
  awk '{print $1}' | \
  sort | uniq -c | sort -rn | head -10

# Save error trend to file
docker logs --since 24h -t my-api-server 2>&1 | \
  grep -i "error\|exception" > /tmp/errors_24h.log
echo "Error log saved: $(wc -l < /tmp/errors_24h.log) lines"
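
The grouping pipeline behaves the same on any log text, so it can be sanity-checked on fabricated sample lines (the messages below are invented for illustration) before running it against a real container:

```shell
# Fabricated sample lines standing in for docker logs output
sample='2024-01-15T09:01:02Z Error: ECONNREFUSED upstream
2024-01-15T09:01:05Z info: request completed
2024-01-15T09:02:11Z Error: ECONNREFUSED upstream
2024-01-15T09:02:30Z Error: timeout waiting for db'

# Same pipeline as above: extract the error token, count, rank
printf '%s\n' "$sample" | \
  grep -oP '(Error|Exception|Warning): \S+' | \
  sort | uniq -c | sort -rn
```

The most frequent error (here, ECONNREFUSED with a count of 2) surfaces at the top, which is exactly how you would prioritize real findings.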

4. Parse and Analyze JSON Format Logs with jq

Structured JSON logs enable much more precise analysis. jq is a powerful tool for filtering, transforming, and aggregating JSON data. Note that the json-file log driver only determines the on-disk storage format; docker logs prints each line exactly as the application wrote it, so the jq pipelines below apply when the application itself emits JSON logs. When JSON lines are mixed with plain text, jq's -R mode with fromjson? lets you skip the non-JSON lines instead of aborting.

# Check Docker log driver
docker inspect --format='{{.HostConfig.LogConfig.Type}}' my-api-server

# Extract only error level entries from JSON logs
docker logs --since 6h my-api-server 2>&1 | \
  jq -r 'select(.level == "error") | "\(.timestamp) [\(.level)] \(.message)"'

# Filter by specific fields
docker logs --since 1h my-api-server 2>&1 | \
  jq -r 'select(.status_code >= 500) | "\(.timestamp) \(.method) \(.path) \(.status_code) \(.response_time)ms"'

# Group error messages and count
docker logs --since 24h my-api-server 2>&1 | \
  jq -r 'select(.level == "error") | .message' | \
  sort | uniq -c | sort -rn | head -10

# Calculate average response time
docker logs --since 1h my-api-server 2>&1 | \
  jq -s '[.[] | select(.response_time != null) | .response_time] | (add / length | floor)'

# Trace a specific user's requests (tracing by request_id)
docker logs --since 2h my-api-server 2>&1 | \
  jq -r 'select(.request_id == "abc-123-def") | "\(.timestamp) \(.message)"'

# Format error logs for readability
docker logs --since 30m my-api-server 2>&1 | \
  jq 'select(.level == "error")' | \
  jq -r '"\(.timestamp | split("T")[1] | split(".")[0]) [\(.service)] \(.message)\n  Stack: \(.stack // "N/A")\n"'
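
Because the Problem section mentions JSON logs mixed with plain text, a straight `jq 'select(...)'` would abort on the first non-JSON line. A sketch of the tolerant variant, using fabricated sample lines:

```shell
# Mixed plain-text and JSON lines (fabricated for illustration);
# -R reads each line as a raw string, and fromjson? // empty
# silently drops lines that are not valid JSON
printf '%s\n' \
  'plain-text startup banner' \
  '{"level":"error","message":"db timeout","status_code":503}' \
  '{"level":"info","message":"ok","status_code":200}' | \
  jq -rR 'fromjson? // empty | select(.level == "error") | .message'
```

Prefixing any of the jq filters in this section with `fromjson? // empty` in -R mode makes them safe against interleaved plain-text output.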

5. Unified Multi-Container Log Analysis with docker compose

In a microservices environment, you need to examine logs from multiple containers simultaneously to trace the propagation path of a problem. The docker compose logs command provides unified log viewing across all services, distinguished by service name prefixes. Setting up a log collection system (ELK, Loki, etc.) enables long-term analysis and dashboard creation.

# View all service logs combined
docker compose logs --since 30m

# View logs for specific services only
docker compose logs --since 1h api-server database redis

# Stream logs in real-time (all services)
docker compose logs -f --tail 20

# Compare error counts by service
for service in $(docker compose ps --services); do
  count=$(docker compose logs --since 1h "$service" 2>&1 | grep -ci "error")
  echo "$service: $count errors"
done

# Trace error propagation path (sorted by timestamp, not service name)
docker compose logs --since 30m -t 2>&1 | \
  grep -i "error\|exception\|timeout\|refused" | \
  sort -t'|' -k2 | \
  head -50

# Check log size per container (disk usage)
for container in $(docker ps -q); do
  name=$(docker inspect --format='{{.Name}}' "$container" | sed 's/\///')
  log_path=$(docker inspect --format='{{.LogPath}}' "$container")
  size=$(du -sh "$log_path" 2>/dev/null | awk '{print $1}')
  echo "$name: $size"
done

# Export logs to file (for analysis)
docker compose logs --since 24h --no-color > /tmp/all_logs_24h.txt
echo "All logs saved: $(wc -l < /tmp/all_logs_24h.txt) lines"

# Also check container resource usage (OOM, etc.)
docker stats --no-stream --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}"
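
The timestamp sort in the propagation trace keys on the field after the `|` prefix that docker compose logs adds. A quick check on hypothetical compose-style lines (service names and messages invented here) shows why the key matters:

```shell
# Hypothetical compose-prefixed lines: "service | timestamp message";
# sorting on field 2 (after the '|') orders events by time across services,
# whereas -k1 would sort by service name and hide the causal order
printf '%s\n' \
  'api-1  | 2024-01-15T09:02:00Z ERROR upstream timeout' \
  'db-1   | 2024-01-15T09:01:58Z ERROR too many connections' | \
  sort -t'|' -k2
```

The database error sorts first even though it comes from the alphabetically later service, revealing it as the likely trigger of the downstream timeout.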

Core Code

Reusable Docker log analysis shell script: checks container status, error statistics, and time distribution in one run

#!/bin/bash
# docker-log-analyzer.sh
# Docker container log analysis script

CONTAINER=${1:?"Usage: $0 <container-name> [time-range]"}
SINCE=${2:-"1h"}

echo "=== Docker Log Analysis: $CONTAINER (last $SINCE) ==="
echo ""

# 1. Container status
echo "[1] Container Status"
docker inspect --format='Status: {{.State.Status}} | Restarts: {{.RestartCount}} | Started: {{.State.StartedAt}}' "$CONTAINER"
echo ""

# 2. Log statistics
TOTAL=$(docker logs --since "$SINCE" "$CONTAINER" 2>&1 | wc -l)
ERRORS=$(docker logs --since "$SINCE" "$CONTAINER" 2>&1 | grep -ci "error\|exception\|fatal")
WARNINGS=$(docker logs --since "$SINCE" "$CONTAINER" 2>&1 | grep -ci "warn")
echo "[2] Log Statistics"
echo "  Total: $TOTAL lines | Errors: $ERRORS | Warnings: $WARNINGS"
echo ""

# 3. Top 10 Error Types
echo "[3] Top 10 Error Types"
docker logs --since "$SINCE" "$CONTAINER" 2>&1 | \
  grep -i "error\|exception" | \
  sed 's/[0-9]\{4\}-[0-9]\{2\}-[0-9]\{2\}[T ][0-9:.]*/[TIME]/g' | \
  sort | uniq -c | sort -rn | head -10
echo ""

# 4. Error Distribution by Time
echo "[4] Error Distribution by Time"
docker logs --since "$SINCE" -t "$CONTAINER" 2>&1 | \
  grep -i "error" | \
  awk '{print substr($1,12,5)}' | \
  sort | uniq -c
echo ""

# 5. Last 5 Errors (detailed)
echo "[5] Last 5 Errors"
docker logs --since "$SINCE" "$CONTAINER" 2>&1 | \
  grep -i "error\|exception" | tail -5
echo ""

echo "=== Analysis Complete ==="

Common Mistakes

Log rotation is not configured, so logs grow without bound

Docker's default log driver is json-file, and without the max-size and max-file options the log file grows indefinitely. Set max-size: "10m" and max-file: "3" under logging.options in your docker-compose.yml; note that rotation discards the oldest entries once the limit is reached, so size limits trade retention for disk safety.
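
A minimal docker-compose.yml fragment with the rotation options described above (the service name and image are illustrative):

```yaml
services:
  my-api-server:            # illustrative service name
    image: my-api:latest    # illustrative image
    logging:
      driver: json-file
      options:
        max-size: "10m"     # rotate once the current file reaches 10 MB
        max-file: "3"       # keep at most 3 files (current + 2 rotated)
```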

Not checking previous logs after container restart

When a container restarts due to a crash, the most recent log lines may not show the point of failure. docker logs has no --previous flag (that belongs to kubectl logs); with the json-file driver, logs from before the restart are still retained in the same file, so pass the restart time from docker inspect to --until to isolate them. In case of an OOM kill, also check dmesg or /var/log/syslog.

Log files exceed disk capacity causing server failure

Without log rotation configured, production server disks can fill up. Set global log settings (log-driver, log-opts) in daemon.json, or add per-service logging configuration in docker-compose.yml.

Not accounting for the difference between UTC and local timezone

Docker log timestamps are in UTC by default. When analyzing error times in your local timezone, always account for the time difference. Setting the TZ environment variable in the container can align the application log timezone.
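
Converting a UTC log timestamp into a local zone can be done with date rather than mental arithmetic. A sketch assuming GNU date (the timezone name is illustrative):

```shell
# Convert a UTC docker log timestamp to a local timezone for correlation
# with user reports; America/New_York is an example zone
TZ="America/New_York" date -d "2024-01-15T09:00:00Z" '+%Y-%m-%d %H:%M:%S %Z'
```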
