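1. Background

Building container images is the first step in moving to containers. Here are a few reasons why docker images are worth optimizing:

  • Shorter image pull time during deployment

  • Better security, with a smaller attack surface

  • Shorter recovery time after a failure

  • Lower storage cost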


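2. Why are the images so large?

A look at the Docker images built from several repos shows a few recurring causes.

2.1 Oversized base image

Project A's image is 9.67GB, of which the base image accounts for 8.72GB.

After taking the image apart: why is the base image itself this large? The result speaks for itself 0.0

2.2 Oversized base image that can no longer be found

Project B's image is 22.7GB.

The base image it uses: 404 not found. That's right, it is gone 0.0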

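2.3 The .git directory (and other unneeded directories)

For more on this, see my earlier article "Why is the Git directory so large" [1]

Project C's image is 795MB.
The .git directory accounts for 225MB of it, because the dockerfile copies the whole working tree:
ADD . /app/startapp/
Directory d accounts for another 300MB: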
d
├── [ 503]  test_421.json
├── [ 483]  test_havalB9.json
...
├── [ 484]  test_144.json
├── [ 104]  .gitmodules
├── [ 122]  .idea
├── [  0]  __init__.py
├── [ 11M]  164103.zip
├── [108M]  test_180753.csv
├── [ 68M]  test_180753.txt
...
└── [ 335]  README.md

None of the files above needs to be baked into the image.

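2.4 Other problems in the Dockerfile itself

Beyond size, the Dockerfiles in these repos have other problems of their own.

As the saying goes, "as long as it works" ~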

3. How to optimize the Dockerfile

3.1 Where to start

Start from how a docker image is put together.

3.1.1 An example

A real-world example

nginx:alpine image, 23.2MB

# docker history nginx:alpine
IMAGE          CREATED      CREATED BY                                      SIZE      COMMENT
b46db85084b8  9 days ago    /bin/sh -c #(nop)  CMD ["nginx" "-g" "daemon…  0B
<missing>      9 days ago    /bin/sh -c #(nop)  STOPSIGNAL SIGQUIT          0B
<missing>      9 days ago    /bin/sh -c #(nop)  EXPOSE 80                    0B
<missing>      9 days ago    /bin/sh -c #(nop)  ENTRYPOINT ["/docker-entr…  0B
<missing>      9 days ago    /bin/sh -c #(nop) COPY file:09a214a3e07c919a…  4.61kB
<missing>      9 days ago    /bin/sh -c #(nop) COPY file:0fd5fca330dcd6a7…  1.04kB
<missing>      9 days ago    /bin/sh -c #(nop) COPY file:0b866ff3fc1ef5b0…  1.96kB
<missing>      9 days ago    /bin/sh -c #(nop) COPY file:65504f71f5855ca0…  1.2kB
<missing>      9 days ago    /bin/sh -c set -x    && addgroup -g 101 -S …  17.6MB
<missing>      9 days ago    /bin/sh -c #(nop)  ENV PKG_RELEASE=1            0B
<missing>      9 days ago    /bin/sh -c #(nop)  ENV NJS_VERSION=0.7.0        0B
<missing>      9 days ago    /bin/sh -c #(nop)  ENV NGINX_VERSION=1.21.4    0B
<missing>      9 days ago    /bin/sh -c #(nop)  LABEL maintainer=NGINX Do…  0B
<missing>      10 days ago  /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B
<missing>      10 days ago  /bin/sh -c #(nop) ADD file:762c899ec0505d1a3…  5.61MB

python:alpine image, 45.5MB

# docker history python:alpine
IMAGE          CREATED      CREATED BY                                      SIZE      COMMENT
382a63bb2f25  10 days ago  /bin/sh -c #(nop)  CMD ["python3"]              0B
<missing>      10 days ago  /bin/sh -c set -ex;  wget -O get-pip.py "$P…  8.31MB
<missing>      10 days ago  /bin/sh -c #(nop)  ENV PYTHON_GET_PIP_SHA256…  0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENV PYTHON_GET_PIP_URL=ht…  0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENV PYTHON_SETUPTOOLS_VER…  0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENV PYTHON_PIP_VERSION=21…  0B
<missing>      10 days ago  /bin/sh -c cd /usr/local/bin  && ln -s idle3…  32B
<missing>      10 days ago  /bin/sh -c set -ex  && apk add --no-cache --…  29.8MB
<missing>      10 days ago  /bin/sh -c #(nop)  ENV PYTHON_VERSION=3.10.0    0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENV GPG_KEY=A035C8C19219B…  0B
<missing>      10 days ago  /bin/sh -c set -eux;  apk add --no-cache  c…  1.82MB
<missing>      10 days ago  /bin/sh -c #(nop)  ENV LANG=C.UTF-8            0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENV PATH=/usr/local/bin:/…  0B
<missing>      10 days ago  /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B
<missing>      10 days ago  /bin/sh -c #(nop) ADD file:762c899ec0505d1a3…  5.61MB

Actual storage on disk

# docker inspect nginx:alpine| jq '.[0]|{GraphDriver}'
{
  "GraphDriver": {
    "Data": {
      "LowerDir": "/data/docker-overlay2/overlay2/3d.../diff:/data/docker-overlay2/overlay2/ae.../diff:/data/docker-overlay2/overlay2/ea.../diff:/data/docker-overlay2/overlay2/29.../diff:/data/docker-overlay2/overlay2/5e.../diff",
      "MergedDir": "/data/docker-overlay2/overlay2/b7.../merged",
      "UpperDir": "/data/docker-overlay2/overlay2/b7.../diff",
      "WorkDir": "/data/docker-overlay2/overlay2/b7.../work"
    },
    "Name": "overlay2"
  }
}

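The layering concept

An image's rootfs is assembled from layers by a storage driver; docker supports drivers such as AUFS, devicemapper, overlay, and overlay2.
With overlay2, the fields in the docker inspect output above mean:
  • LowerDir: the image layers

  • MergedDir: the unified view that combines the lower layers with the upper read-write layer

  • UpperDir: the read-write layer

  • WorkDir: an intermediate layer; writes to the Upper layer go to WorkDir first and are then moved into UpperDir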
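
The LowerDir value is a single colon-separated string with one directory per layer. As a small sketch of how to read it, the helper below splits such a string into one path per line. The name split_lower_dirs is hypothetical, and the commented usage assumes docker and jq are installed, as in the inspect command above.

```shell
#!/bin/sh
# split_lower_dirs is a hypothetical helper: it takes the colon-separated
# LowerDir value reported by docker inspect and prints one directory per line.
split_lower_dirs() {
    printf '%s\n' "$1" | tr ':' '\n'
}

# Usage against a real image (assumes docker and jq are available):
#   lower=$(docker inspect nginx:alpine | jq -r '.[0].GraphDriver.Data.LowerDir')
#   split_lower_dirs "$lower" | wc -l    # number of lower layers
```

Counting the printed lines gives the number of lower layers stacked under the image's top layer.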

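3.1.2 Copy on write

Docker image layers are read-only. When a container modifies a file that lives in an image layer, the file is first copied up into the container's read-write layer and the change is made to that copy; the image layer itself stays untouched.

3.1.3 UnionFS

A union filesystem mounts the contents of several directories (also called branches) together at a single mount point, while the directories themselves stay physically separate.

For example, images such as nginx:1.15 and nginx:1.16 can share whatever layers they have in common, so those layers are stored only once.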

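3.2 Approaches

Once you know what makes up an image's size, it is easy to see where to cut.

3.2.1 Reduce the number of layers

Each RUN instruction in a Dockerfile adds a layer, so merge consecutive RUN instructions into one where possible.

For example:

Before merging, three layers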

RUN apk add tzdata
RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo "Asia/Shanghai" > /etc/timezone

After merging, one layer

RUN apk add tzdata \
    && cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
    && echo "Asia/Shanghai" > /etc/timezone

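3.2.2 Reduce the size of each layer

3.2.2.1 Choose a smaller base image
scratch is an empty image and the smallest possible base: an image built FROM scratch contains only what you add to it (the pause image, for example, is built on scratch) [2]. When a minimal linux userland is needed, Alpine is a common choice; its package manager is apk.
3.2.2.2 Multi-stage builds
A Dockerfile can contain more than one FROM instruction, and each FROM starts a new build stage.
Name a stage with FROM … AS … and copy files out of it with COPY --from, so that only the final stage ends up in the image.
A java image built in a single stage, 812MB: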
FROM centos AS jdk
COPY jdk-8u231-linux-x64.tar.gz /usr/local/src
RUN cd /usr/local/src && \
    tar -xzvf jdk-8u231-linux-x64.tar.gz -C /usr/local

The same image with a multi-stage build, 618MB:
FROM centos AS jdk
COPY jdk-8u231-linux-x64.tar.gz /usr/local/src
RUN cd /usr/local/src && \
    tar -xzvf jdk-8u231-linux-x64.tar.gz -C /usr/local

FROM centos
COPY --from=jdk /usr/local/jdk1.8.0_231 /usr/local
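3.2.2.3 Ignore files
The build context is the set of files that docker build sends to the Docker Daemon.
When docker build starts, it prints "Sending build context to Docker daemon xxxMB", so a large context slows down every build.
Results of instructions such as RUN are cached between builds; pass --no-cache to docker build to rebuild without the cache.
A .dockerignore file tells Docker which files to leave out of the context, in the same way that .gitignore does for git.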
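
As a rough sketch of the idea, the shell helper below writes a .dockerignore into a project directory. The helper name write_dockerignore is hypothetical, and the patterns are assumptions drawn from the oversized project in section 2.3 (.git, IDE metadata, bulky test data); adjust them for each repo.

```shell
#!/bin/sh
# write_dockerignore is a hypothetical helper; the patterns below are
# assumptions based on the project in section 2.3, not a universal list.
write_dockerignore() {
    cat > "$1/.dockerignore" <<'EOF'
# version-control and IDE metadata
.git
.gitmodules
.idea
# bulky test data and archives
*.zip
*.csv
EOF
}

# Usage: write_dockerignore /path/to/project
```

A pattern such as *.zip only matches at the root of the context; use **/*.zip to match at any depth.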
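3.2.2.4 Remote downloads
Avoid using ADD to bring an archive into the image, since the archive then stays in a layer. Download and unpack it in a single RUN instead (the stream is gzip-compressed, so tar needs -z when reading from a pipe):
RUN curl -s http://192.168.1.1/repository/tools/jdk-8u241-linux-x64.tar.gz | tar -xzC /opt/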
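3.2.2.5 Split COPY
Suppose a single COPY brings in directory A, which has 4 subdirectories: AA, BB, CC, and DD. A change in any one of them invalidates the cache for the whole COPY layer.
Splitting the COPY lets the unchanged subdirectories keep their cached layers: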
COPY A/AA /app/A/AA
COPY A/BB /app/A/BB
COPY A/CC /app/A/CC
COPY A/DD /app/A/DD
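3.2.2.6 Build-time mounts

Build-time mounts (an experimental feature [3])

Configuration

Enable experimental features (start dockerd with --experimental) and put a syntax directive such as # syntax=docker/dockerfile:1.1.1-experimental on the first line of the Dockerfile.

Usage

  • Mount a local golang build cache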
# syntax = docker/dockerfile:experimental
FROM golang
...
RUN --mount=type=cache,target=/root/.cache/go-build go build ...
  • Mount cache directories
# syntax = docker/dockerfile:experimental
FROM ubuntu
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
  apt update && apt install -y gcc
  • Mount credentials
# syntax = docker/dockerfile:experimental
FROM python:3
RUN pip install awscli
RUN --mount=type=secret,id=aws,target=/root/.aws/credentials aws s3 cp s3://... ...

And so on

3.2.2.7 Clean up after the build
  • Delete archives once they are unpacked
  • Clear package-manager caches
    • --no-cache
    • rm -rf /var/lib/apt/lists/*
    • rm -rf /var/cache/yum/*
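
Cleanup only shrinks the image when it runs in the same RUN instruction that created the files: a file deleted in a later layer is hidden from the final view but still stored in the earlier layer. The sketch below writes a small Dockerfile that installs a package and clears the apt lists in one RUN. The helper name write_cleanup_dockerfile, the ubuntu base, and the gcc package are assumptions chosen for illustration.

```shell
#!/bin/sh
# write_cleanup_dockerfile is a hypothetical helper. It writes a Dockerfile
# in which install and cleanup share one RUN, so the apt lists never end up
# in any layer. Base image and package are illustrative assumptions.
write_cleanup_dockerfile() {
    cat > "$1/Dockerfile" <<'EOF'
FROM ubuntu
RUN apt-get update \
    && apt-get install -y --no-install-recommends gcc \
    && rm -rf /var/lib/apt/lists/*
EOF
}

# Usage: write_cleanup_dockerfile /path/to/project && docker build /path/to/project
```

Moving the rm -rf into a separate RUN would leave the image the same size, because the lists would already be stored in the layer created by the install step.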
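3.2.2.8 Squash the image
Use docker export and docker import to flatten a container's filesystem into a single-layer image.

The drawback of this method is that part of the image's metadata is lost, such as the layer history and settings like CMD and ENTRYPOINT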

# docker run -d --name nginx nginx:alpine
# docker export nginx |docker import - nginx:alpine2
sha256:dd6a3cf822ac3c3ad3e7f7b31675cd8cd99a6f80e360996e04da6fc2f3b98cb5
# docker history nginx:alpine
IMAGE          CREATED      CREATED BY                                      SIZE      COMMENT
b46db85084b8  10 days ago  /bin/sh -c #(nop)  CMD ["nginx" "-g" "daemon…  0B
<missing>      10 days ago  /bin/sh -c #(nop)  STOPSIGNAL SIGQUIT          0B
<missing>      10 days ago  /bin/sh -c #(nop)  EXPOSE 80                    0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENTRYPOINT ["/docker-entr…  0B
<missing>      10 days ago  /bin/sh -c #(nop) COPY file:09a214a3e07c919a…  4.61kB
<missing>      10 days ago  /bin/sh -c #(nop) COPY file:0fd5fca330dcd6a7…  1.04kB
<missing>      10 days ago  /bin/sh -c #(nop) COPY file:0b866ff3fc1ef5b0…  1.96kB
<missing>      10 days ago  /bin/sh -c #(nop) COPY file:65504f71f5855ca0…  1.2kB
<missing>      10 days ago  /bin/sh -c set -x    && addgroup -g 101 -S …  17.6MB
<missing>      10 days ago  /bin/sh -c #(nop)  ENV PKG_RELEASE=1            0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENV NJS_VERSION=0.7.0        0B
<missing>      10 days ago  /bin/sh -c #(nop)  ENV NGINX_VERSION=1.21.4    0B
<missing>      10 days ago  /bin/sh -c #(nop)  LABEL maintainer=NGINX Do…  0B
<missing>      10 days ago  /bin/sh -c #(nop)  CMD ["/bin/sh"]              0B
<missing>      10 days ago  /bin/sh -c #(nop) ADD file:762c899ec0505d1a3…  5.61MB
# docker history nginx:alpine2
IMAGE          CREATED          CREATED BY  SIZE      COMMENT
dd6a3cf822ac  40 seconds ago                23MB      Imported from -
# docker images|grep nginx
nginx                                                                                                              alpine2                    dd6a3cf822ac  54 seconds ago  23MB
nginx                                                                                                              alpine                      b46db85084b8  10 days ago      23.2MB

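3.3 Samples

3.3.1 go samples

Sample 1

Take the kube-apiserver image of a k8s cluster installed with kubeadm; it is built with bazel, and its Dockerfile looks roughly like this: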
bazel build ...
LABEL maintainers=Kubernetes Authors
LABEL description=go based runner for distroless scenarios
WORKDIR /
COPY /workspace/go-runner . # buildkit
ENTRYPOINT ["/go-runner"]
COPY file:2e904ea733ba0ded2a99947847de31414a19d83f8495dd8c1fbed3c70bf67a22 in /usr/local/bin/kube-apiserver

Source directory: 28M (including a 20.5M .git directory)

Image size: 122MB

Sample 2

The Dockerfile of the Cadence project:
ARG TARGET=server

# Can be used in case a proxy is necessary
ARG GOPROXY

# Build tcheck binary
FROM golang:1.17-alpine3.13 AS tcheck

WORKDIR /go/src/github.com/uber/tcheck

COPY go.* ./
RUN go build -mod=readonly -o /go/bin/tcheck github.com/uber/tcheck

# Build Cadence binaries
FROM golang:1.17-alpine3.13 AS builder

ARG RELEASE_VERSION

RUN apk add --update --no-cache ca-certificates make git curl mercurial unzip

WORKDIR /cadence

# Making sure that dependency is not touched
ENV GOFLAGS="-mod=readonly"

# Copy go mod dependencies and build cache
COPY go.* ./
RUN go mod download

COPY . .
RUN rm -fr .bin .build

ENV CADENCE_RELEASE_VERSION=$RELEASE_VERSION

# bypass codegen, use committed files.  must be run separately, before building things.
RUN make .fake-codegen
RUN CGO_ENABLED=0 make copyright cadence-cassandra-tool cadence-sql-tool cadence cadence-server cadence-bench cadence-canary


# Download dockerize
FROM alpine:3.11 AS dockerize

RUN apk add --no-cache openssl

ENV DOCKERIZE_VERSION v0.6.1
RUN wget https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
    && tar -C /usr/local/bin -xzvf dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
    && rm dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
    && echo "**** fix for host id mapping error ****" \
    && chown root:root /usr/local/bin/dockerize


# Alpine base image
FROM alpine:3.11 AS alpine

RUN apk add --update --no-cache ca-certificates tzdata bash curl

# set up nsswitch.conf for Go's "netgo" implementation
# https://github.com/gliderlabs/docker-alpine/issues/367#issuecomment-424546457
RUN test ! -e /etc/nsswitch.conf && echo 'hosts: files dns' > /etc/nsswitch.conf

SHELL ["/bin/bash", "-c"]


# Cadence server
FROM alpine AS cadence-server

ENV CADENCE_HOME /etc/cadence
RUN mkdir -p /etc/cadence

COPY --from=tcheck /go/bin/tcheck /usr/local/bin
COPY --from=dockerize /usr/local/bin/dockerize /usr/local/bin
COPY --from=builder /cadence/cadence-cassandra-tool /usr/local/bin
COPY --from=builder /cadence/cadence-sql-tool /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin
COPY --from=builder /cadence/cadence-server /usr/local/bin
COPY --from=builder /cadence/schema /etc/cadence/schema

COPY docker/entrypoint.sh /docker-entrypoint.sh
COPY config/dynamicconfig /etc/cadence/config/dynamicconfig
COPY config/credentials /etc/cadence/config/credentials
COPY docker/config_template.yaml /etc/cadence/config
COPY docker/start-cadence.sh /start-cadence.sh

WORKDIR /etc/cadence

ENV SERVICES="history,matching,frontend,worker"

EXPOSE 7933 7934 7935 7939
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD /start-cadence.sh


# All-in-one Cadence server
FROM cadence-server AS cadence-auto-setup

RUN apk add --update --no-cache ca-certificates py-pip mysql-client
RUN pip install cqlsh

COPY docker/start.sh /start.sh

CMD /start.sh


# Cadence CLI
FROM alpine AS cadence-cli

COPY --from=tcheck /go/bin/tcheck /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin

ENTRYPOINT ["cadence"]

# Cadence Canary
FROM alpine AS cadence-canary

COPY --from=builder /cadence/cadence-canary /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin

CMD ["/usr/local/bin/cadence-canary", "--root", "/etc/cadence-canary", "start"]

# Cadence Bench
FROM alpine AS cadence-bench

COPY --from=builder /cadence/cadence-bench /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin

CMD ["/usr/local/bin/cadence-bench", "--root", "/etc/cadence-bench", "start"]

# Final image
FROM cadence-${TARGET}

Source directory: 85.4M (including a 57.7M .git directory)

Image size: 135.69MB

3.3.2 py sample

FROM python:3.4

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        postgresql-client \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /usr/src/app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . .

EXPOSE 8000
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

Source directory: 275M (including a 222M .git directory)

Image size: 436MB

4. What else can be done beyond these optimizations

4.1 Set the character set

In the Dockerfile:
# Set lang
ENV LANG "en_US.UTF-8"

4.2 Fix the time zone

For more on this, see my earlier article "Several ways to handle container time issues in a k8s environment" [4]

In the Dockerfile:
# Set timezone
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
  && echo "Asia/Shanghai" > /etc/timezone

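4.3 Process management

In a docker container, the process started by the Dockerfile's ENTRYPOINT or CMD runs as PID 1.

Besides running the application, this main process has another important job: managing "zombie processes".

A zombie is a child process that has called exit but whose parent has not yet collected its exit status.

The main ways to clean up "zombie processes" are:

Set the parent's SIGCHLD handler to SIG_IGN so that exited children are reaped automatically, or fork twice so that the orphaned grandchild is adopted and reaped by init.

Open-source options available today

Tini

tini is a minimal init for containers: it spawns a single child and does what an init must do, which is wait for that child while reaping zombies and forwarding signals.

Advantages

tini is tiny; tini ships with Docker itself (docker run --init); and with Tini, SIGTERM properly terminates the process even if it installs no signal handler of its own.

Example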

# Add Tini
ENV TINI_VERSION v0.19.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tini
RUN chmod +x /tini
ENTRYPOINT ["/tini", "--"]

# Run your program under Tini
CMD ["/your/program", "-and", "-its", "arguments"]
# or docker run your-image /your/program ...
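
To make the signal-forwarding part of an init's job concrete, here is a minimal shell sketch. It is not how tini is implemented, only an illustration of the idea: start the program as a child, pass SIGTERM on to it, and collect its exit status. The function name run_forwarding is hypothetical.

```shell
#!/bin/sh
# run_forwarding is a hypothetical helper illustrating signal forwarding.
# It is a teaching sketch, not a replacement for tini or dumb-init.
run_forwarding() {
    # Start the real program in the background and remember its PID.
    "$@" &
    child=$!

    # Forward SIGTERM and SIGINT to the child (both as SIGTERM).
    trap 'kill -TERM "$child" 2>/dev/null' TERM INT

    # wait returns early when a trapped signal arrives, so keep waiting
    # until the child is really gone, then report its exit status.
    status=0
    wait "$child" || status=$?
    while kill -0 "$child" 2>/dev/null; do
        status=0
        wait "$child" || status=$?
    done

    trap - TERM INT
    return "$status"
}

# Usage: run_forwarding /your/program -and -its arguments
```

Without the trap, a shell running as PID 1 would not pass SIGTERM on to the program, and docker stop would fall back to killing the container after its timeout.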

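dumb-init

dumb-init is a simple process supervisor and init system designed to run as PID 1 inside minimal container environments.
By default dumb-init forwards signals to the child's whole process group; set DUMB_INIT_SETSID=0 to have signals delivered only to the direct child.
dumb-init can also rewrite a signal before forwarding it (see --rewrite in the example below).

Example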

FROM alpine:3.11.5
RUN sed -i "s/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g" /etc/apk/repositories \
    && apk add --no-cache dumb-init

# Runs "/usr/bin/dumb-init -- /my/script --with --args"
ENTRYPOINT ["dumb-init", "--"]

# or if you use --rewrite or other cli flags
# ENTRYPOINT ["dumb-init", "--rewrite", "2:3", "--"]

CMD ["/my/script", "--with", "--args"]

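4.4 Run with reduced privileges

On a vm, services such as nginx are normally run as an unprivileged user rather than root; containers should follow the same practice.
A tomcat example: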
...
USER tomcat
WORKDIR /usr/local/tomcat
EXPOSE 8080
ENTRYPOINT ["catalina.sh","run"]
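Avoid sudo inside docker images: sudo has unpredictable TTY and signal-forwarding behavior. When a container has to start as root and then run the service as a non-root user, use gosu instead.
An ENTRYPOINT script that uses gosu [5]: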
#!/bin/bash
set -e

if [ "$1" = 'postgres' ]; then
    chown -R postgres "$PGDATA"

    if [ -z "$(ls -A "$PGDATA")" ]; then
        gosu postgres initdb
    fi

    exec gosu postgres "$@"
fi

exec "$@"

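4.5 Underlying library dependencies

Running java on alpine needs extra care.
To run a jdk/jre on alpine, glibc has to be installed first, typically together with ca-certificates so that the glibc packages can be downloaded.
A jdk8 unpacked onto alpine cannot run as is: the jdk's java binary is linked against the GNU Standard C library (glibc), while alpine is based on MUSL libc (mini libc), so alpine needs glibc added.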
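
A quick way to tell which C library a root filesystem uses is to look for the dynamic loader. The sketch below is a heuristic: musl installs its loader as /lib/ld-musl-<arch>.so.1, which a glibc-based system does not have. The function name detect_libc is hypothetical.

```shell
#!/bin/sh
# detect_libc is a hypothetical helper. It guesses the C library of a root
# filesystem by looking for musl's dynamic loader, /lib/ld-musl-<arch>.so.1.
detect_libc() {
    root=${1:-/}
    for loader in "${root%/}"/lib/ld-musl-*.so.1; do
        # If the glob matched nothing, the pattern stays literal and -e fails.
        if [ -e "$loader" ]; then
            echo musl
            return 0
        fi
    done
    echo "glibc or other"
}

# Usage inside a container: detect_libc
# Usage against an unpacked rootfs: detect_libc /path/to/rootfs
```

Run inside an alpine container this should report musl, which is the sign that a glibc-linked binary such as the jdk's java will not start without extra libraries.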

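5. Summary

This article looked at why images grow so large, how to slim them down through the Dockerfile, and a few other Dockerfile practices worth adopting.

See you ~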

References

[1] https://www.ssgeek.com/post/git-mu-lu-wei-shi-me-zhe-me-da/
[2] https://docs.docker.com/develop/develop-images/baseimages/
[3] https://docs.docker.com/engine/reference/commandline/dockerd/#description
[4] https://www.ssgeek.com/post/k8s-huan-jing-xia-chu-li-rong-qi-shi-jian-wen-ti-de-duo-chong-zi-shi/
[5] https://hub.docker.com/_/postgres/