[Kubernetes] Rollout & Pod/Container Restarts

📌 Overview

One-line summary: Rolling Update is the mechanism that safely replaces the Pods of a Deployment, DaemonSet, or similar workload.

Key concepts:

  • Pod restart: the Pod is deleted and recreated (new IP, AGE resets)
  • Container restart: the Pod is kept and only the container restarts (RESTARTS increases)
  • Rollout: when the Pod template (spec.template) changes, Pods are replaced step by step and safely

🔄 Pod Restart vs Container Restart

Pod restart (Pod deleted and recreated)

kubectl delete pod <pod-name>
    ↓
The Pod is removed completely
    ↓
Its controller creates a new Pod (new IP, new AGE)
    ↓
kubectl get pods → AGE is short again

Characteristics:

  • Pod IP changes
  • AGE resets
  • RESTARTS resets to 0
  • New resources are allocated

Causes:

  • kubectl delete pod
  • Node failure
  • Deployment image change
  • kubectl rollout restart

Container restart (Pod is kept)

Container process exits abnormally (OOMKill, crash, etc.)
    ↓
The Pod itself stays in place
    ↓
Only the container is restarted
    ↓
kubectl get pods → AGE unchanged, RESTARTS increased

Characteristics:

  • Pod IP is kept
  • AGE is kept
  • RESTARTS keeps accumulating
  • The same resources are reused

Causes:

  • OOMKilled (memory limit exceeded)
  • Process crash (repeated crashes show up as CrashLoopBackOff)
  • Liveness probe failure
  • Container error

Comparison

| Item | Pod restart | Container restart |
| --- | --- | --- |
| AGE | Changes (reset) | Unchanged |
| IP | Changes | Unchanged |
| RESTARTS | Reset to 0 | Accumulates |
| Cause | kubectl delete, node failure, rollout | OOMKill, crash, probe failure |
| How to check | kubectl get pods → check AGE | kubectl get pods → check RESTARTS |

🎯 Rollout Mechanism

Basic principle

Pod template change detected (image change, annotation change, etc.)
    ↓
Pods are replaced step by step according to the update strategy (strategy / updateStrategy)
    ↓
The next Pod is replaced only after the new Pod is Ready
    ↓
All Pods replaced

How rollout restart works

kubectl rollout restart daemonset -n kube-system calico-node

Internally:

1. A restart timestamp is added as an annotation on the DaemonSet's Pod template
   kubectl.kubernetes.io/restartedAt: "2025-02-19T21:00:00+09:00"
    ↓
2. This counts as a Pod template change → Rolling Update is triggered
    ↓
3. Pods are restarted in sequence according to updateStrategy
    ↓
4. For each Pod:
   - Delete the existing Pod
   - Create the new Pod
   - Wait until it is Ready
   - Move on to the next Pod
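
Step 1 above can be reproduced by hand, which makes it clearer that a restart is just a Pod template change. This is a minimal sketch, not how kubectl is implemented: `restart_patch` and `manual_restart` are hypothetical names, and the patch is assumed to be applied with the default strategic merge of `kubectl patch`.

```shell
# Build the patch that a restart amounts to: a timestamp annotation on the
# Pod template (spec.template), not on the DaemonSet object itself.
restart_patch() {
  # $1: RFC 3339 timestamp; defaults to the current UTC time
  ts="${1:-$(date -u +%Y-%m-%dT%H:%M:%SZ)}"
  printf '{"spec":{"template":{"metadata":{"annotations":{"kubectl.kubernetes.io/restartedAt":"%s"}}}}}' "$ts"
}

# Hypothetical wrapper: apply the patch to a workload, e.g.
#   manual_restart daemonset calico-node kube-system
manual_restart() {
  ${KUBECTL:-kubectl} patch "$1" "$2" -n "${3:-default}" -p "$(restart_patch)"
}
```

Because only the annotation value changes, the Pods come back with the same image and configuration; the new timestamp is what makes the controller treat the template as updated.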

🔧 updateStrategy

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
spec:
  replicas: 5
  selector:              # required in apps/v1; must match the template labels
    matchLabels:
      app: web-app
  strategy:
    type: RollingUpdate  # default
    rollingUpdate:
      maxUnavailable: 1  # how many Pods may be unavailable at once
      maxSurge: 1        # how many extra Pods may be created above replicas
  template:
    metadata:
      labels:
        app: web-app     # label value is an illustrative assumption
    spec:
      containers:
      - name: nginx
        image: nginx:1.21

Simplified example (replicas: 5 → image change):

1. One new Pod is created (maxSurge=1) → 6 Pods in total
2. The new Pod becomes Ready
3. One old Pod is deleted (maxUnavailable=1) → 5 Pods in total
4. Repeat → all Pods replaced
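
The two settings translate into simple bounds: during the update the total number of Pods stays at or below replicas + maxSurge, and the number of available Pods stays at or above replicas - maxUnavailable. A small sketch of that arithmetic, assuming absolute numbers rather than percentages (`rollout_bounds` is a made-up helper name):

```shell
# Print the Pod-count bounds a Deployment rolling update stays within.
# $1 = replicas, $2 = maxSurge, $3 = maxUnavailable (absolute numbers only)
rollout_bounds() {
  replicas=$1; surge=$2; unavailable=$3
  echo "max_total=$((replicas + surge)) min_available=$((replicas - unavailable))"
}

rollout_bounds 5 1 1   # prints: max_total=6 min_available=4
```

For the manifest above this means at most 6 Pods exist at any moment and at least 4 keep serving.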

DaemonSet (calico-node)

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: calico-node
  namespace: kube-system
spec:
  selector:              # required in apps/v1; must match the template labels
    matchLabels:
      k8s-app: calico-node
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1  # how many nodes' Pods may be restarted at once
      maxSurge: 0        # default; no extra Pod is started on a node during the update
  template:
    metadata:
      labels:
        k8s-app: calico-node  # label is an illustrative assumption
    spec:
      containers:
      - name: calico-node
        image: calico/node:v3.20

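DaemonSet characteristics:

  • Only one Pod runs per node
  • maxSurge: 0 (the default: the old Pod on a node is removed before its replacement starts)
  • maxUnavailable: 1 → Pods are restarted one node at a time

updateStrategy types

| type | Behavior | When to use |
| --- | --- | --- |
| RollingUpdate | Replaces Pods step by step (default) | Regular updates |
| OnDelete | Replaces a Pod only when you delete it yourself (not automatic) | When manual control is needed |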

💻 Rollout Commands

Basic commands

# Restart (works by changing a Pod template annotation)
kubectl rollout restart deployment web-app -n default
kubectl rollout restart daemonset calico-node -n kube-system

# Check progress
kubectl rollout status deployment web-app -n default
# Waiting for deployment "web-app" rollout to finish: 2 out of 5 new replicas have been updated...
# deployment "web-app" successfully rolled out

# Pause (when a problem shows up; Deployments only)
kubectl rollout pause deployment web-app -n default

# Resume
kubectl rollout resume deployment web-app -n default

# Show history
kubectl rollout history deployment web-app -n default
# REVISION  CHANGE-CAUSE
# 1         <none>
# 2         kubectl set image deployment/web-app nginx=nginx:1.22

# Roll back to the previous revision
kubectl rollout undo deployment web-app -n default

# Roll back to a specific revision
kubectl rollout undo deployment web-app -n default --to-revision=1
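
The commands above are often chained: restart, wait for the rollout, and roll back if it does not finish. A rough sketch of that flow follows. The helper name `safe_restart`, the `KUBECTL` override, and the 120s default are assumptions for illustration; `--timeout` is the `kubectl rollout status` flag that makes the wait give up.

```shell
# Hypothetical helper: restart a Deployment, wait for the rollout, and roll
# back if it does not finish in time.
# $1 = deployment name, $2 = namespace (default: default), $3 = timeout (default: 120s)
safe_restart() {
  name=$1; ns="${2:-default}"; timeout="${3:-120s}"
  k="${KUBECTL:-kubectl}"
  $k rollout restart "deployment/$name" -n "$ns" || return 1
  if $k rollout status "deployment/$name" -n "$ns" --timeout="$timeout"; then
    echo "rollout of $name finished"
  else
    echo "rollout of $name did not finish; rolling back" >&2
    $k rollout undo "deployment/$name" -n "$ns"
    return 1
  fi
}
```

Whether an automatic undo is appropriate depends on the workload; for a plain restart the previous revision differs only by the restart annotation, so the rollback mainly stops the stuck update.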

Real-time monitoring

# Watch rollout progress in real time
watch kubectl get pods -n default

# DaemonSet status (desired / ready / up-to-date counts)
kubectl get daemonset -n kube-system calico-node -o wide

# Check events
kubectl get events -n default --sort-by='.lastTimestamp' | grep web-app

🔍 Analyzing Container Restart Causes

Finding the restart cause

# Find the restart cause
kubectl describe pod <pod-name> -n <namespace> | grep -A 10 "Last State"

# Example output:
# Last State:     Terminated
#   Reason:       OOMKilled
#   Exit Code:    137
#   Started:      Wed, 19 Feb 2025 10:00:00 +0900
#   Finished:     Wed, 19 Feb 2025 10:05:00 +0900

# Logs of the previous container instance
kubectl logs <pod-name> -n <namespace> --previous

# Find the Pods with the most restarts (sorted by the first container's count)
kubectl get pods -A --sort-by='.status.containerStatuses[0].restartCount' | tail -10
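
The sort above only looks at the first container of each Pod. For multi-container Pods, one option is to sum the counts per Pod. The sketch below assumes input lines of the form `<namespace>/<pod> <count> <count> ...`; `top_restarts` is a made-up helper name.

```shell
# Sum restart counts per Pod from lines like "<ns>/<pod> <count> <count> ..."
# and print the ten highest totals, largest first.
top_restarts() {
  awk '{ total = 0; for (i = 2; i <= NF; i++) total += $i; print total, $1 }' |
    sort -rn | head -n 10
}

# Hypothetical usage against a live cluster:
#   kubectl get pods -A -o jsonpath='{range .items[*]}{.metadata.namespace}/{.metadata.name} {.status.containerStatuses[*].restartCount}{"\n"}{end}' | top_restarts
```

Pods that have no container status yet simply show a total of 0.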

Main restart causes

| Cause | Symptom | How to confirm | Action |
| --- | --- | --- | --- |
| OOMKilled | Memory limit exceeded | Reason: OOMKilled / Exit Code: 137 | Raise the memory limit |
| CrashLoopBackOff | Process exits abnormally | Reason: Error / Exit Code: 1 | Check logs, fix the code |
| Liveness probe failure | Health check fails | "Liveness probe failed" in events | Review probe settings |
| Node resource shortage | Several Pods restart at once | kubectl top nodes | Check node resources |
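
The exit codes in the table follow the usual convention that a value above 128 means the process was terminated by a signal (code minus 128), so 137 is 128 + 9, i.e. SIGKILL, which is what the kernel OOM killer sends. A small sketch of that decoding (`explain_exit_code` is a made-up helper name):

```shell
# Interpret a container exit code: values above 128 conventionally mean
# "terminated by signal (code - 128)".
explain_exit_code() {
  code=$1
  if [ "$code" -eq 0 ]; then
    echo "exited normally"
  elif [ "$code" -gt 128 ]; then
    echo "killed by signal $((code - 128))"
  else
    echo "application error (exit $code)"
  fi
}

explain_exit_code 137   # prints: killed by signal 9
explain_exit_code 1     # prints: application error (exit 1)
```

An exit code of 137 alone does not prove an OOM kill; the Reason field in `kubectl describe pod` is what confirms it.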

Debugging example

# Check for OOMKilled
kubectl describe pod my-app-123 -n default | grep -i oom

# Check resource usage
kubectl top pod my-app-123 -n default

# Check events
kubectl get events -n default --field-selector involvedObject.name=my-app-123

⚠️ Cautions

Rollout

1. maxUnavailable set too high

# ❌ Risky
rollingUpdate:
  maxUnavailable: 5  # all 5 of 5 replicas may go down → service outage

# ✅ Safe
rollingUpdate:
  maxUnavailable: 1  # keeps a minimum level of availability

2. Problems when no probe is configured

# ❌ No probe
spec:
  containers:
  - name: app
    image: myapp:1.0
    # no readinessProbe → the new Pod receives traffic before it is ready

# ✅ Probe configured
spec:
  containers:
  - name: app
    image: myapp:1.0
    readinessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 10
      periodSeconds: 5
3. DaemonSet maxUnavailable

# ❌ Risky (3-node cluster)
maxUnavailable: 3  # every node restarts at once → calico-node is down cluster-wide

# ✅ Safe
maxUnavailable: 1  # one node at a time

Container restarts

1. RESTARTS keeps increasing

# Identify the cause
kubectl describe pod <pod-name> | grep -A 10 "Last State"

# CrashLoopBackOff → application/code problem
# OOMKilled → not enough memory
# Liveness probe failed → probe configuration problem

2. Fixing memory shortage

# Check current usage
kubectl top pod <pod-name>

# Raise the memory limit (in the container spec of the workload manifest)
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "1Gi"  # raised from 512Mi to 1Gi

📝 Summary

Pod vs container restart

Pod restart:
- Pod is deleted and recreated
- New IP, AGE reset, RESTARTS 0
- Causes: kubectl delete, rollout, node failure

Container restart:
- Pod is kept, only the container restarts
- IP kept, AGE kept, RESTARTS increases
- Causes: OOMKill, crash, probe failure

Rollout essentials

How it works:
1. A Pod template change is detected (image, annotation)
2. Pods are replaced step by step according to the update strategy
3. The next step starts only after the new Pod is Ready

Commands:
kubectl rollout restart <type> <name>
kubectl rollout status <type> <name>
kubectl rollout pause / resume deployment <name>
kubectl rollout undo <type> <name>
kubectl rollout undo <type> <name> --to-revision=N

Safe settings:
- maxUnavailable: 1 (keeps minimum availability)
- readinessProbe is a must
- maxUnavailable: 1 recommended for DaemonSets

This post is licensed under CC BY 4.0 by the author.
