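1. Background
Container images are the first step in any move to containers. The main reasons to optimize them:
Shorter image download time at deployment
Better security, with less surface exposed to attack
Shorter recovery time after a failure
Lower storage cost
2. Why are images so large?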
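2.1 The base image is too large
Repo A: image size 9.67GB, base image 8.72GB. Working backwards through the image to see why the base image is still this large, the result speaks for itself.
2.2 The base image is too large, and can no longer be found
Repo B: image size 22.7GB. The base image it uses: 404 not found. That's right, it is gone.
2.3 The .git directory (an unnecessary directory)
For more on this, see my earlier article "Why is the Git directory so large" [1]
Repo C: image size 795MB, of which the .git directory is 225MB, because the Dockerfile copies the whole repository:
ADD . /app/startapp/
The d directory adds another 300MB. Contents of d: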
├── [ 503] test_421.json
├── [ 483] test_havalB9.json
...
├── [ 484] test_144.json
├── [ 104] .gitmodules
├── [ 122] .idea
├── [ 0] __init__.py
├── [ 11M] 164103.zip
├── [108M] test_180753.csv
├── [ 68M] test_180753.txt
...
└── [ 335] README.md
None of the files above needs to be built into the image.
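One way to act on this is to look at what a build would pick up and then exclude it with a .dockerignore. The sketch below is only an illustration under assumptions: it works on a throwaway directory, the file names (app.py, sample.zip, sample.csv, objects.pack) are invented stand-ins for the listing above, and the ignore patterns are a guess at what a repository like this would want to leave out.

```shell
#!/bin/sh
# Throwaway directory standing in for the repository shown above.
# All file names here are illustrative, not taken from the real repo.
ctx=$(mktemp -d)
mkdir -p "$ctx/.git" "$ctx/.idea" "$ctx/data"
echo 'print("hello")' > "$ctx/app.py"
# 2 MiB placeholders for the archive, the data dump and the git objects.
dd if=/dev/zero of="$ctx/data/sample.zip" bs=1024 count=2048 2>/dev/null
dd if=/dev/zero of="$ctx/data/sample.csv" bs=1024 count=2048 2>/dev/null
dd if=/dev/zero of="$ctx/.git/objects.pack" bs=1024 count=2048 2>/dev/null

# List files larger than 1 MiB, relative to the context root.
big_files=$(cd "$ctx" && find . -type f -size +1024k)
echo "$big_files"

# Keep them out of the build context. Patterns are relative to the
# context root; a leading "**/" makes a pattern match at any depth.
cat > "$ctx/.dockerignore" <<'EOF'
.git
.gitmodules
.idea
**/*.zip
**/*.csv
**/test_*.json
EOF
```

Placing the .dockerignore next to the Dockerfile, at the root of the build context, is what makes docker build honor it.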
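2.4 Other problems in the Dockerfile itself
The Dockerfiles in these repos have plenty of other problems of their own; the guiding principle seems to have been "if it works, it's good enough" ~
3. How to optimize a Dockerfile
3.1 Where to start
Start with how a docker image is put together.
3.1.1 An example
A real-world example
The nginx:alpine image, 23.2MB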
# docker history nginx:alpine
IMAGE CREATED CREATED BY SIZE COMMENT
b46db85084b8 9 days ago /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon… 0B
<missing> 9 days ago /bin/sh -c #(nop) STOPSIGNAL SIGQUIT 0B
<missing> 9 days ago /bin/sh -c #(nop) EXPOSE 80 0B
<missing> 9 days ago /bin/sh -c #(nop) ENTRYPOINT ["/docker-entr… 0B
<missing> 9 days ago /bin/sh -c #(nop) COPY file:09a214a3e07c919a… 4.61kB
<missing> 9 days ago /bin/sh -c #(nop) COPY file:0fd5fca330dcd6a7… 1.04kB
<missing> 9 days ago /bin/sh -c #(nop) COPY file:0b866ff3fc1ef5b0… 1.96kB
<missing> 9 days ago /bin/sh -c #(nop) COPY file:65504f71f5855ca0… 1.2kB
<missing> 9 days ago /bin/sh -c set -x && addgroup -g 101 -S … 17.6MB
<missing> 9 days ago /bin/sh -c #(nop) ENV PKG_RELEASE=1 0B
<missing> 9 days ago /bin/sh -c #(nop) ENV NJS_VERSION=0.7.0 0B
<missing> 9 days ago /bin/sh -c #(nop) ENV NGINX_VERSION=1.21.4 0B
<missing> 9 days ago /bin/sh -c #(nop) LABEL maintainer=NGINX Do… 0B
<missing> 10 days ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 10 days ago /bin/sh -c #(nop) ADD file:762c899ec0505d1a3… 5.61MB
The python:alpine image, 45.5MB
# docker history python:alpine
IMAGE CREATED CREATED BY SIZE COMMENT
382a63bb2f25 10 days ago /bin/sh -c #(nop) CMD ["python3"] 0B
<missing> 10 days ago /bin/sh -c set -ex; wget -O get-pip.py "$P… 8.31MB
<missing> 10 days ago /bin/sh -c #(nop) ENV PYTHON_GET_PIP_SHA256… 0B
<missing> 10 days ago /bin/sh -c #(nop) ENV PYTHON_GET_PIP_URL=ht… 0B
<missing> 10 days ago /bin/sh -c #(nop) ENV PYTHON_SETUPTOOLS_VER… 0B
<missing> 10 days ago /bin/sh -c #(nop) ENV PYTHON_PIP_VERSION=21… 0B
<missing> 10 days ago /bin/sh -c cd /usr/local/bin && ln -s idle3… 32B
<missing> 10 days ago /bin/sh -c set -ex && apk add --no-cache --… 29.8MB
<missing> 10 days ago /bin/sh -c #(nop) ENV PYTHON_VERSION=3.10.0 0B
<missing> 10 days ago /bin/sh -c #(nop) ENV GPG_KEY=A035C8C19219B… 0B
<missing> 10 days ago /bin/sh -c set -eux; apk add --no-cache c… 1.82MB
<missing> 10 days ago /bin/sh -c #(nop) ENV LANG=C.UTF-8 0B
<missing> 10 days ago /bin/sh -c #(nop) ENV PATH=/usr/local/bin:/… 0B
<missing> 10 days ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 10 days ago /bin/sh -c #(nop) ADD file:762c899ec0505d1a3… 5.61MB
Actual storage on disk
# docker inspect nginx:alpine| jq '.[0]|{GraphDriver}'
{
  "GraphDriver": {
    "Data": {
      "LowerDir": "/data/docker-overlay2/overlay2/3d.../diff:/data/docker-overlay2/overlay2/ae.../diff:/data/docker-overlay2/overlay2/ea.../diff:/data/docker-overlay2/overlay2/29.../diff:/data/docker-overlay2/overlay2/5e.../diff",
      "MergedDir": "/data/docker-overlay2/overlay2/b7.../merged",
      "UpperDir": "/data/docker-overlay2/overlay2/b7.../diff",
      "WorkDir": "/data/docker-overlay2/overlay2/b7.../work"
    },
    "Name": "overlay2"
  }
}
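How layering works
A Docker image supplies the rootfs of a container, and that rootfs is assembled from layers. Docker has used several storage drivers to do this, including AUFS, devicemapper, overlay and overlay2. The output above comes from overlay2, where:
LowerDir: the image layers (read-only)
MergedDir: the merged view that presents the lower layers and the upper read-write layer together
UpperDir: the read-write layer
WorkDir: an intermediate area; writes to the upper layer go into WorkDir first and are then moved into UpperDir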
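To make the lookup order concrete, here is a small simulation that needs neither Docker nor root. It is a sketch under assumptions, not how overlay2 is implemented: the directories upper, lower1 and lower2 and the helper resolve are invented for this example, and whiteouts (deleted files) and directory merging are not modeled. It only shows that the upper layer wins, and that lower layers are searched in the order they appear in the colon-separated LowerDir value.

```shell
#!/bin/sh
# Three plain directories stand in for one read-write layer and two image layers.
root=$(mktemp -d)
mkdir -p "$root/upper" "$root/lower1" "$root/lower2"
echo "from lower2" > "$root/lower2/a.conf"
echo "from lower1" > "$root/lower1/a.conf"   # shadows lower2/a.conf
echo "from lower2" > "$root/lower2/b.conf"
echo "from upper"  > "$root/upper/c.conf"

# Colon-separated, same shape as the LowerDir value shown by docker inspect.
# The leftmost entry is the topmost lower layer.
lower_dirs="$root/lower1:$root/lower2"

# resolve NAME: print NAME from the first layer that contains it,
# checking the upper layer first and then each lower layer in order.
resolve() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in "$root/upper" $lower_dirs; do
        if [ -e "$dir/$name" ]; then
            IFS=$old_ifs
            cat "$dir/$name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

resolve a.conf
resolve b.conf
resolve c.conf
```

What resolve returns for each name is what a process would see under MergedDir.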
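3.1.2 Copy on write
Image layers in Docker are read-only. When a container modifies a file that lives in an image layer, the file is first copied up into the read-write layer and changed there.
3.1.3 UnionFS
UnionFS mounts the contents of several directories (also called branches) onto a single directory, while the directories themselves remain physically separate.
Layers can be shared between images this way: nginx:1.15 and nginx:1.16, for example, can reuse the layers they have in common.
3.2 Approaches
Once the main contributors to image size are understood, it is easy to see where to start cutting.
3.2.1 Reduce the number of layers
Each RUN instruction in a Dockerfile produces a layer, so chaining related commands into one RUN keeps the layer count down. An example:
Before merging, three layers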
RUN apk add tzdata
RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo "Asia/Shanghai" > /etc/timezone
After merging, one layer
RUN apk add tzdata \
&& cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& echo "Asia/Shanghai" > /etc/timezone
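A rough way to check how many layers the RUN instructions of a Dockerfile contribute is simply to count them. The following is a sketch with stated limits: count_run_layers is a made-up helper, it only looks at lines that begin a RUN instruction, it ignores heredocs and multi-stage builds, and the FROM alpine line is an assumption added so that the two snippets above form complete files.

```shell
#!/bin/sh
# count_run_layers FILE: print how many lines start a RUN instruction.
# Continuation lines ("&& ...") do not start with RUN, so a chained RUN counts once.
count_run_layers() {
    grep -c -i -E '^[[:space:]]*RUN[[:space:]]' "$1"
}

work=$(mktemp -d)

# The "before" variant: three separate RUN instructions.
cat > "$work/Dockerfile.before" <<'EOF'
FROM alpine
RUN apk add tzdata
RUN cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime
RUN echo "Asia/Shanghai" > /etc/timezone
EOF

# The "after" variant: the same commands chained into one RUN.
cat > "$work/Dockerfile.after" <<'EOF'
FROM alpine
RUN apk add tzdata \
    && cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
    && echo "Asia/Shanghai" > /etc/timezone
EOF

before=$(count_run_layers "$work/Dockerfile.before")
after=$(count_run_layers "$work/Dockerfile.after")
echo "RUN layers before: $before, after: $after"
```

On a built image, docker history gives the real answer; the count here is only a quick check before building.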
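3.2.2 Reduce the size of each layer
3.2.2.1 Choose a smaller base image
scratch is an empty image, so building FROM scratch adds nothing beyond what you copy in; the pause image, for example, is built on scratch [2]. Alpine is a very small linux distribution that installs packages with apk.
3.2.2.2 Multi-stage builds
A Dockerfile may contain more than one FROM. Each FROM starts a new stage, a stage can be named with FROM … AS …, and a later stage can take files from it with COPY --from. Only the last stage ends up in the final image. Taking java as an example, a single-stage build produces an 812MB image:
FROM centos AS jdk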
COPY jdk-8u231-linux-x64.tar.gz /usr/local/src
RUN cd /usr/local/src && \
tar -xzvf jdk-8u231-linux-x64.tar.gz -C /usr/local
With a second stage that copies only the unpacked JDK, the image is 618MB:
FROM centos AS jdk
COPY jdk-8u231-linux-x64.tar.gz /usr/local/src
RUN cd /usr/local/src && \
tar -xzvf jdk-8u231-linux-x64.tar.gz -C /usr/local
FROM centos
COPY --from=jdk /usr/local/jdk1.8.0_231 /usr/local
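3.2.2.3 Ignore files
The build context is the set of files that docker build sends to the Docker daemon. Every docker build starts with a line such as "Sending build context to Docker daemon xxxMB", and the larger the context, the slower the build. A .dockerignore file tells Docker what to leave out of the context, in the same way that .gitignore does for git.
3.2.2.4 Remote download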
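Rather than adding a large archive to the image with ADD, fetch it from a remote server and unpack it within a single RUN, so the archive is never stored in a layer or in the build context:
# -z is passed explicitly because tar does not reliably detect gzip when reading from a pipe
RUN curl -s http://192.168.1.1/repository/tools/jdk-8u241-linux-x64.tar.gz | tar -xzf - -C /opt/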
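The point of piping the download straight into tar is that the archive itself never lands on disk. The same mechanics can be tried locally without a network. In this sketch cat stands in for curl, and the paths and the file name tool are invented for the example.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/src/tool/bin" "$work/opt"
echo "fake binary" > "$work/src/tool/bin/tool"

# Stand-in for the remote archive (in the Dockerfile it lives on the HTTP server).
tar -czf "$work/tool.tar.gz" -C "$work/src" tool

# Stream the archive into tar: "-f -" reads from stdin, -z gunzips,
# -C selects the directory to unpack into.
cat "$work/tool.tar.gz" | tar -xzf - -C "$work/opt"

# The unpacked tree is there, and no .tar.gz was written under opt/.
leftover=$(find "$work/opt" -name '*.tar.gz' | wc -l | tr -d ' ')
echo "archives left under opt/: $leftover"
```

In the Dockerfile the effect is the same: the layer created by that RUN holds only the unpacked files.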
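3.2.2.5 Split COPY
When a directory copied with COPY is large, it can be split. Suppose directory A has 4 subdirectories, AA/BB/CC/DD: instead of one COPY for all of A, use one COPY per subdirectory, so that a change in one of them does not produce a new copy of the others:
COPY A/AA /app/A/AA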
COPY A/BB /app/A/BB
COPY A/CC /app/A/CC
COPY A/DD /app/A/DD
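Docker decides whether a COPY layer can be reused by checksumming the files being copied. The sketch below mimics that idea outside Docker, purely as an illustration: fingerprint is a hypothetical helper built on cksum, not Docker's actual algorithm, and the directory layout follows the A/AA to A/DD example above.

```shell
#!/bin/sh
# fingerprint DIR: one checksum over the names and contents of all files in DIR.
fingerprint() {
    (cd "$1" && find . -type f | LC_ALL=C sort | while read -r f; do
        printf '%s ' "$f"
        cksum < "$f"
    done) | cksum
}

work=$(mktemp -d)
for sub in AA BB CC DD; do
    mkdir -p "$work/A/$sub"
    echo "content of $sub" > "$work/A/$sub/file.txt"
done

# Fingerprints before the change: per subdirectory, and for A as a whole.
aa_before=$(fingerprint "$work/A/AA")
bb_before=$(fingerprint "$work/A/BB")
all_before=$(fingerprint "$work/A")

# Change a single file under A/BB only.
echo "edited" >> "$work/A/BB/file.txt"

aa_after=$(fingerprint "$work/A/AA")
bb_after=$(fingerprint "$work/A/BB")
all_after=$(fingerprint "$work/A")

# One COPY of all of A sees a change; with split COPY only A/BB does.
[ "$aa_before" = "$aa_after" ] && echo "A/AA unchanged"
[ "$bb_before" != "$bb_after" ] && echo "A/BB changed"
[ "$all_before" != "$all_after" ] && echo "A as a whole changed"
```

In a real build, every instruction after the first changed COPY is run again, so it helps to order the COPY lines from least to most frequently changed.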
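3.2.2.6 Build-time mounts
Build-time mounts (an extended feature [3])
Configuration
Enable experimental features (--experimental), and make the following the first line of the Dockerfile:
# syntax=docker/dockerfile:1.1.1-experimental
Usage
Mount a local golang build cache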
# syntax = docker/dockerfile:experimental
FROM golang
...
RUN --mount=type=cache,target=/root/.cache/go-build go build ...
Mount cache directories
# syntax = docker/dockerfile:experimental
FROM ubuntu
RUN rm -f /etc/apt/apt.conf.d/docker-clean; echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache
RUN --mount=type=cache,target=/var/cache/apt --mount=type=cache,target=/var/lib/apt \
apt update && apt install -y gcc
Mount credentials
# syntax = docker/dockerfile:experimental
FROM python:3
RUN pip install awscli
RUN --mount=type=secret,id=aws,target=/root/.aws/credentials aws s3 cp s3://... ...
And so on.
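3.2.2.7 Clean up after building
Delete archives once they have been unpacked
Clean the package manager's cache in the same RUN that filled it:
--no-cache (apk add)
rm -rf /var/lib/apt/lists/*
rm -rf /var/cache/yum/*
3.2.2.8 Squash the image
Running a container from the image and piping docker export into docker import flattens the image into a single layer. The drawback of this method is that part of the image information is lost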
# docker run -d --name nginx nginx:alpine
# docker export nginx |docker import - nginx:alpine2
sha256:dd6a3cf822ac3c3ad3e7f7b31675cd8cd99a6f80e360996e04da6fc2f3b98cb5
# docker history nginx:alpine
IMAGE CREATED CREATED BY SIZE COMMENT
b46db85084b8 10 days ago /bin/sh -c #(nop) CMD ["nginx" "-g" "daemon… 0B
<missing> 10 days ago /bin/sh -c #(nop) STOPSIGNAL SIGQUIT 0B
<missing> 10 days ago /bin/sh -c #(nop) EXPOSE 80 0B
<missing> 10 days ago /bin/sh -c #(nop) ENTRYPOINT ["/docker-entr… 0B
<missing> 10 days ago /bin/sh -c #(nop) COPY file:09a214a3e07c919a… 4.61kB
<missing> 10 days ago /bin/sh -c #(nop) COPY file:0fd5fca330dcd6a7… 1.04kB
<missing> 10 days ago /bin/sh -c #(nop) COPY file:0b866ff3fc1ef5b0… 1.96kB
<missing> 10 days ago /bin/sh -c #(nop) COPY file:65504f71f5855ca0… 1.2kB
<missing> 10 days ago /bin/sh -c set -x && addgroup -g 101 -S … 17.6MB
<missing> 10 days ago /bin/sh -c #(nop) ENV PKG_RELEASE=1 0B
<missing> 10 days ago /bin/sh -c #(nop) ENV NJS_VERSION=0.7.0 0B
<missing> 10 days ago /bin/sh -c #(nop) ENV NGINX_VERSION=1.21.4 0B
<missing> 10 days ago /bin/sh -c #(nop) LABEL maintainer=NGINX Do… 0B
<missing> 10 days ago /bin/sh -c #(nop) CMD ["/bin/sh"] 0B
<missing> 10 days ago /bin/sh -c #(nop) ADD file:762c899ec0505d1a3… 5.61MB
# docker history nginx:alpine2
IMAGE CREATED CREATED BY SIZE COMMENT
dd6a3cf822ac 40 seconds ago 23MB Imported from -
# docker images|grep nginx
nginx alpine2 dd6a3cf822ac 54 seconds ago 23MB
nginx alpine b46db85084b8 10 days ago 23.2MB
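3.3 Examples
3.3.1 go examples
Example 1
The kube-apiserver image of a k8s cluster installed with kubeadm. It is not built from a hand-written Dockerfile but with bazel; its build history reads:
bazel build ...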
LABEL maintainers=Kubernetes Authors
LABEL description=go based runner for distroless scenarios
WORKDIR /
COPY /workspace/go-runner . # buildkit
ENTRYPOINT ["/go-runner"]
COPY file:2e904ea733ba0ded2a99947847de31414a19d83f8495dd8c1fbed3c70bf67a22 in /usr/local/bin/kube-apiserver
Source directory: 28M (of which the .git directory is 20.5M)
Image size: 122MB
Example 2
The Dockerfile of Cadence:
ARG TARGET=server
# Can be used in case a proxy is necessary
ARG GOPROXY
# Build tcheck binary
FROM golang:1.17-alpine3.13 AS tcheck
WORKDIR /go/src/github.com/uber/tcheck
COPY go.* ./
RUN go build -mod=readonly -o /go/bin/tcheck github.com/uber/tcheck
# Build Cadence binaries
FROM golang:1.17-alpine3.13 AS builder
ARG RELEASE_VERSION
RUN apk add --update --no-cache ca-certificates make git curl mercurial unzip
WORKDIR /cadence
# Making sure that dependency is not touched
ENV GOFLAGS="-mod=readonly"
# Copy go mod dependencies and build cache
COPY go.* ./
RUN go mod download
COPY . .
RUN rm -fr .bin .build
ENV CADENCE_RELEASE_VERSION=$RELEASE_VERSION
# bypass codegen, use committed files. must be run separately, before building things.
RUN make .fake-codegen
RUN CGO_ENABLED=0 make copyright cadence-cassandra-tool cadence-sql-tool cadence cadence-server cadence-bench cadence-canary
# Download dockerize
FROM alpine:3.11 AS dockerize
RUN apk add --no-cache openssl
ENV DOCKERIZE_VERSION v0.6.1
RUN wget https://github.com/jwilder/dockerize/releases/download/$DOCKERIZE_VERSION/dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
&& tar -C /usr/local/bin -xzvf dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
&& rm dockerize-alpine-linux-amd64-$DOCKERIZE_VERSION.tar.gz \
&& echo "**** fix for host id mapping error ****" \
&& chown root:root /usr/local/bin/dockerize
# Alpine base image
FROM alpine:3.11 AS alpine
RUN apk add --update --no-cache ca-certificates tzdata bash curl
# set up nsswitch.conf for Go's "netgo" implementation
# https://github.com/gliderlabs/docker-alpine/issues/367#issuecomment-424546457
RUN test ! -e /etc/nsswitch.conf && echo 'hosts: files dns' > /etc/nsswitch.conf
SHELL ["/bin/bash", "-c"]
# Cadence server
FROM alpine AS cadence-server
ENV CADENCE_HOME /etc/cadence
RUN mkdir -p /etc/cadence
COPY --from=tcheck /go/bin/tcheck /usr/local/bin
COPY --from=dockerize /usr/local/bin/dockerize /usr/local/bin
COPY --from=builder /cadence/cadence-cassandra-tool /usr/local/bin
COPY --from=builder /cadence/cadence-sql-tool /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin
COPY --from=builder /cadence/cadence-server /usr/local/bin
COPY --from=builder /cadence/schema /etc/cadence/schema
COPY docker/entrypoint.sh /docker-entrypoint.sh
COPY config/dynamicconfig /etc/cadence/config/dynamicconfig
COPY config/credentials /etc/cadence/config/credentials
COPY docker/config_template.yaml /etc/cadence/config
COPY docker/start-cadence.sh /start-cadence.sh
WORKDIR /etc/cadence
ENV SERVICES="history,matching,frontend,worker"
EXPOSE 7933 7934 7935 7939
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD /start-cadence.sh
# All-in-one Cadence server
FROM cadence-server AS cadence-auto-setup
RUN apk add --update --no-cache ca-certificates py-pip mysql-client
RUN pip install cqlsh
COPY docker/start.sh /start.sh
CMD /start.sh
# Cadence CLI
FROM alpine AS cadence-cli
COPY --from=tcheck /go/bin/tcheck /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin
ENTRYPOINT ["cadence"]
# Cadence Canary
FROM alpine AS cadence-canary
COPY --from=builder /cadence/cadence-canary /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin
CMD ["/usr/local/bin/cadence-canary", "--root", "/etc/cadence-canary", "start"]
# Cadence Bench
FROM alpine AS cadence-bench
COPY --from=builder /cadence/cadence-bench /usr/local/bin
COPY --from=builder /cadence/cadence /usr/local/bin
CMD ["/usr/local/bin/cadence-bench", "--root", "/etc/cadence-bench", "start"]
# Final image
FROM cadence-${TARGET}
Source directory: 85.4M (of which the .git directory is 57.7M)
Image size: 135.69MB
3.3.2 py example
FROM python:3.4
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
postgresql-client \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /usr/src/app
COPY requirements.txt ./
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]
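Source directory: 275M (of which the .git directory is 222M)
Image size: 436MB
4. What else can be done beyond these optimizations
4.1 Set the character set
In the Dockerfile:
# Set lang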
ENV LANG "en_US.UTF-8"
4.2 Correct the time zone
For more on this, see my earlier article "Several ways to handle container time in a k8s environment" [4]
In the Dockerfile:
# Set timezone
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime \
&& echo "Asia/Shanghai" > /etc/timezone
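4.3 Process management
In a docker container, the process started by the ENTRYPOINT or CMD of the Dockerfile runs as PID 1. Beyond that, this main process has another important job: managing "zombie processes"
A zombie is a child process that has called exit but has not yet been reaped by its parent. The main ways to clean up "zombie processes" are
Set the disposition of SIGCHLD to SIG_IGN in the parent, so that the kernel reaps the children itself
fork twice, so that the orphaned grandchild is adopted and reaped by init
Open-source tools that do this today
Tini
tini is a minimal init for containers: it starts a single child, acts as init for it, and waits for it to exit
Advantages
tini reaps zombie processes that the software under it creates by accident
tini makes default signal handling work inside a Docker image: with Tini, SIGTERM terminates the process even when the process installs no handler for it
Example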
# Add Tini
ENV TINI_VERSION v0.19.0
ADD https://github.com/krallin/tini/releases/download/${TINI_VERSION}/tini /tini
RUN chmod +x /tini
ENTRYPOINT ["/tini", "--"]
# Run your program under Tini
CMD ["/your/program", "-and", "-its", "arguments"]
# or docker run your-image /your/program ...
dumb-init
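dumb-init is another minimal init for containers. By default dumb-init forwards each signal to the whole process group of its child; with DUMB_INIT_SETSID=0, dumb-init signals only the direct child.
Example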
FROM alpine:3.11.5
RUN sed -i "s/dl-cdn.alpinelinux.org/mirrors.aliyun.com/g" /etc/apk/repositories \
&& apk add --no-cache dumb-init
# Runs "/usr/bin/dumb-init -- /my/script --with --args"
ENTRYPOINT ["dumb-init", "--"]
# or if you use --rewrite or other cli flags
# ENTRYPOINT ["dumb-init", "--rewrite", "2:3", "--"]
CMD ["/my/script", "--with", "--args"]
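4.4 Start with reduced privileges
On a vm, services such as nginx and tomcat are normally run as a dedicated unprivileged user rather than root, and a container should be no different:
...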
USER tomcat
WORKDIR /usr/local/tomcat
EXPOSE 8080
ENTRYPOINT ["catalina.sh","run"]
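sudo is a poor fit inside docker: sudo starts the program as its own child, and sudo's TTY and signal-forwarding behavior gets in the way. When the container has to begin as root for setup and then give up root, use gosu in the ENTRYPOINT script instead [5]:
#!/bin/bash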
set -e
if [ "$1" = 'postgres' ]; then
chown -R postgres "$PGDATA"
if [ -z "$(ls -A "$PGDATA")" ]; then
gosu postgres initdb
fi
exec gosu postgres "$@"
fi
exec "$@"
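The exec on the last lines of that script matters. exec replaces the shell with the target program instead of starting it as a child, so the program keeps the shell's PID (PID 1 in a container) and receives signals directly. A small sketch of the effect, runnable outside Docker; the variable names are only for illustration.

```shell
#!/bin/sh
# With exec: the inner shell replaces the outer one, so both report the same PID.
with_exec=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
outer_pid=$(echo "$with_exec" | sed -n 1p)
inner_pid=$(echo "$with_exec" | sed -n 2p)
echo "with exec:    outer=$outer_pid inner=$inner_pid"

# Without exec: the inner shell is a child process with its own PID.
# The trailing ":" stops the outer shell from exec-ing its last command on its own.
without_exec=$(sh -c 'echo $$; sh -c "echo \$\$"; :')
child_outer=$(echo "$without_exec" | sed -n 1p)
child_inner=$(echo "$without_exec" | sed -n 2p)
echo "without exec: outer=$child_outer inner=$child_inner"
```

Without the exec, the entrypoint shell would stay in front of postgres as PID 1, and a SIGTERM sent to the container would reach the shell rather than the database.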
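4.5 Underlying library dependencies
Take care when running java on alpine. The usual jdk/jre builds are linked against glibc, so glibc (and ca-certificates, to download it) has to be installed before they will run on alpine, for example to run jdk8. The reason is that such a jdk expects java to find the GNU Standard C library (glibc), whereas alpine is based on MUSL libc (mini libc), so alpine has no glibc unless you add it.
5. Summary
Those are the main directions for optimizing a Dockerfile. See you ~
References
[1] https://www.ssgeek.com/post/git-mu-lu-wei-shi-me-zhe-me-da/
[2] https://docs.docker.com/develop/develop-images/baseimages/
[3] https://docs.docker.com/engine/reference/commandline/dockerd/#description
[4] https://www.ssgeek.com/post/k8s-huan-jing-xia-chu-li-rong-qi-shi-jian-wen-ti-de-duo-chong-zi-shi/
[5] https://hub.docker.com/_/postgres/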