Basic framework
2
.gitattributes
vendored
Normal file
@@ -0,0 +1,2 @@
# Shell scripts: always LF (fixes env: 'bash\r' on Linux)
*.sh text eol=lf
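The attribute above only takes effect for files as Git writes them, so scripts already checked out with CRLF can still fail. A minimal sketch for spotting them in the working tree; `has_crlf` and `list_crlf_scripts` are hypothetical helper names, not scripts from this repository:

```shell
# has_crlf FILE: succeeds when FILE contains at least one carriage return.
has_crlf() {
  local cr
  cr=$(printf '\r')
  grep -q "$cr" -- "$1"
}

# list_crlf_scripts DIR: prints every *.sh file under DIR that still contains CRLF.
list_crlf_scripts() {
  local f
  find "$1" -type f -name '*.sh' | sort | while IFS= read -r f; do
    if has_crlf "$f"; then
      printf '%s\n' "$f"
    fi
  done
}
```

For example, `list_crlf_scripts scripts` lists the offenders. Running `git add --renormalize .` afterwards should re-apply the new attribute to tracked files.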
5
.gitignore
vendored
Normal file
@@ -0,0 +1,5 @@
.cursor
.ssh
_bmad
_bmad-output
design-artifacts
63
README.md
Normal file
@@ -0,0 +1,63 @@
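# Lab Build (Beginner Entry Point)

This repository is a hands-on record of a homelab build:
starting from scratch, it sets up a K3s cluster, configures Traefik as the ingress, runs common services (such as GitLab, monitoring, and a home dashboard), and ships troubleshooting scripts alongside.

If this is your first visit, don't worry: follow the order below one step at a time.
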
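## Finding your way around this repository

- Main documentation entry point: `docs/00-00-构建总览.md`
- Deployment environment notes: `docs/00-04-部署环境说明.md` (node layout, IPs, versions, etc.)
- Main script entry point: `scripts/README.md`
- Verification status at a glance: `docs/00-02-验证矩阵.md`

How these four entry points divide the work:

- `README.md`: the beginner entry point; it covers what to do and in what order.
- `00-00-构建总览.md`: the documentation index; it tells you which document to read next.
- `00-01-k3s-基础概念.md`: the concept quick reference; look up unfamiliar K3s/Traefik/NetworkPolicy terms here.
- `00-02-验证矩阵.md`: the status board; it shows which documents have been run successfully in a real environment.

Directory conventions are simple:

- Main documents live in `docs/`
- Scripts live in `scripts/`
- Scripts are run from the repository root by default (for example `./scripts/...`)
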
## Recommended installation order for beginners (plain-language version)

1. **Read the overview first; don't rush into installing**
   Open `docs/00-00-构建总览.md` and get the overall topology and each machine's role clear.

2. **Install the K3s cluster (pick one of two ways)**
   - **Automated**: follow `docs/01-07-节点初始化-ansible-实践.md` and let Ansible initialize nodes 61~64 and install the server/workers in one run (verified).
   - **Manual**: install control node 61 per `docs/01-01-k3s-控制节点含traefik.md`, then join worker nodes 62~64 per `docs/01-02-k3s-工作节点.md`.

3. **Confirm the nodes are Ready**
   Run `kubectl get nodes` and confirm every node is Ready.

4. **Do a minimal verification with nginx first**
   Follow `docs/04-03-k3s-nginx-demo.md` to get basic reachability working before moving on to nodejs.

5. **Then do nodejs, dashboard, and acme**
   See `docs/04-01-k3s-nodejs-高级部署.md`, `docs/03-01-k3s-traefik-dashboard.md`, and `docs/03-02-k3s-traefik-acme.md` respectively.

6. **On a 502 or no connectivity, go straight to the troubleshooting scripts**
   Copy the commands from `scripts/README.md`, running the ingress-path diagnostics and the firewalld baseline scripts first.

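The node check in step 3 can be scripted. A minimal sketch, assuming `kubectl` is already configured where you run it; `not_ready_nodes` is a hypothetical helper name, not a script from this repository:

```shell
# not_ready_nodes: reads `kubectl get nodes --no-headers` output on stdin and
# prints the name of every node whose STATUS column is not exactly "Ready"
# (so "NotReady" and "Ready,SchedulingDisabled" are both reported).
not_ready_nodes() {
  awk 'NF >= 2 && $2 != "Ready" { print $1 }'
}
```

Usage: `kubectl get nodes --no-headers | not_ready_nodes`. Empty output means every listed node is Ready.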
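## 30-minute quick run (the bare minimum)

If you are short on time, do just these 4 steps first and expand once they work:

1. **Install the cluster**: one-shot install with Ansible per `docs/01-07-节点初始化-ansible-实践.md` (recommended); or install the control node (61) and a worker node (62) manually per `docs/01-01` + `docs/01-02`
2. Run `kubectl get nodes` and confirm the nodes are Ready
3. Deploy the nginx example per `docs/04-03-k3s-nginx-demo.md` and access it once
4. If it is unreachable, follow `scripts/README.md` and run the firewalld baseline and ingress-path diagnostic scripts first

Once you get this far, the basic path is cleared. Continuing with nodejs, dashboard, and acme will be much easier afterwards.
If you like, you can also set the matching document's status to "verified" in `docs/00-02-验证矩阵.md` for future reference.
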
## One-line advice

Get the basic path (61/62:80) working before layering services on top; run a `curl` check after every step and troubleshooting gets much easier.
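The per-step `curl` check can be wrapped in a small function. A minimal sketch, assuming the nodes are reachable at `192.168.2.61` and `192.168.2.62` (addresses taken from the HAProxy configs in this commit; adjust to yours). `probe_entry` is a hypothetical name, and the `CURL` variable exists only so the command can be swapped out:

```shell
# probe_entry IP...: prints "<ip> <http_code>" for port 80 on each IP.
# A code of 000 means nothing answered; any other code (even a 404 for an
# unmatched route) means something is listening and responding on that node.
probe_entry() {
  local ip code
  for ip in "$@"; do
    code=$("${CURL:-curl}" -s -o /dev/null -m 5 -w '%{http_code}' "http://${ip}:80/") || code=000
    printf '%s %s\n' "$ip" "$code"
  done
}
```

Usage: `probe_entry 192.168.2.61 192.168.2.62`.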
5
ansible/ansible.cfg
Normal file
@@ -0,0 +1,5 @@
[defaults]
# Skip host key confirmation on first connection (acceptable in a lab environment)
host_key_checking = False
# Use the inventory in the same directory
inventory = inventory.ini
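Outside a lab, an alternative to disabling host key checking is to pre-seed `known_hosts`. A minimal sketch, assuming the four nodes sit at `192.168.2.61` through `192.168.2.64` as in the HAProxy configs; `lab_node_ips` is a hypothetical helper name:

```shell
# lab_node_ips [PREFIX] [FIRST] [LAST]: prints one node IP per line.
# The defaults reflect the 61~64 layout assumed in this repository.
lab_node_ips() {
  local prefix="${1:-192.168.2}" n="${2:-61}" last="${3:-64}"
  while [ "$n" -le "$last" ]; do
    printf '%s.%s\n' "$prefix" "$n"
    n=$((n + 1))
  done
}
```

Usage: `lab_node_ips | xargs ssh-keyscan -H >> ~/.ssh/known_hosts` records the host keys once, after which `host_key_checking` could be left at its default.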
11
ansible/files/01-08-haproxy/README.md
Normal file
@@ -0,0 +1,11 @@
# 01-08 HAProxy Configuration

For use with `docs/01-08-openwrt-haproxy.md`; can be shared with Ansible (copy to OpenWrt or push via a playbook).

| File | Description |
|------|------|
| haproxy.cfg | Base configuration, TCP health checks |
| haproxy-proxy.cfg | Enables send-proxy-v2 (real client IP for Traefik) |
| haproxy-proxy-http-tls.cfg | HTTP check + TLS check + PROXY combined |

Change `192.168.2.61`~`192.168.2.64` to your actual node IPs. If 80/443 are blocked, change `bind *:80` / `bind *:443` to `*:18080` / `*:18443`.
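The port change described above can be applied mechanically. A minimal sketch, assuming the `bind *:80` / `bind *:443` lines look exactly as in the three configs here; `rebind_ports` is a hypothetical helper name:

```shell
# rebind_ports: reads an HAProxy config on stdin and rewrites only the
# `bind *:80` and `bind *:443` lines to the alternative ports 18080/18443.
# `server` lines keep their original :80/:443 targets.
rebind_ports() {
  awk '
    $1 == "bind" && $2 == "*:80"  { sub(/\*:80/,  "*:18080") }
    $1 == "bind" && $2 == "*:443" { sub(/\*:443/, "*:18443") }
    { print }
  '
}
```

Usage: `rebind_ports < haproxy.cfg > haproxy-altports.cfg` (the output filename is just an example).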
39
ansible/files/01-08-haproxy/haproxy-proxy-http-tls.cfg
Normal file
@@ -0,0 +1,39 @@
# 01-08 HAProxy - upgraded health checks (HTTP+TLS) + PROXY Protocol
# Combination: k3s_http uses option httpchk, k3s_https uses ssl-hello-chk, both with send-proxy-v2
# Docs: docs/01-08-openwrt-haproxy.md, section 5 "Health checks combined with PROXY"
global
    log /dev/log local0
    maxconn 4096

defaults
    mode http
    option httplog
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend http_in
    bind *:80
    default_backend k3s_http

frontend https_in
    bind *:443
    mode tcp
    default_backend k3s_https

backend k3s_http
    option httpchk GET /
    balance roundrobin
    server ylc61 192.168.2.61:80 check send-proxy-v2
    server ylc62 192.168.2.62:80 check send-proxy-v2
    server ylc63 192.168.2.63:80 check send-proxy-v2
    server ylc64 192.168.2.64:80 check send-proxy-v2

backend k3s_https
    mode tcp
    option ssl-hello-chk
    balance roundrobin
    server ylc61 192.168.2.61:443 check send-proxy-v2
    server ylc62 192.168.2.62:443 check send-proxy-v2
    server ylc63 192.168.2.63:443 check send-proxy-v2
    server ylc64 192.168.2.64:443 check send-proxy-v2
37
ansible/files/01-08-haproxy/haproxy-proxy.cfg
Normal file
@@ -0,0 +1,37 @@
# 01-08 HAProxy - enable PROXY Protocol (send-proxy-v2)
# Lets Traefik see the real client IP; must be paired with Traefik trustedIPs
# Docs: docs/01-08-openwrt-haproxy.md, section 5
global
    log /dev/log local0
    maxconn 4096

defaults
    mode http
    option httplog
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend http_in
    bind *:80
    default_backend k3s_http

frontend https_in
    bind *:443
    mode tcp
    default_backend k3s_https

backend k3s_http
    balance roundrobin
    server ylc61 192.168.2.61:80 check send-proxy-v2
    server ylc62 192.168.2.62:80 check send-proxy-v2
    server ylc63 192.168.2.63:80 check send-proxy-v2
    server ylc64 192.168.2.64:80 check send-proxy-v2

backend k3s_https
    mode tcp
    balance roundrobin
    server ylc61 192.168.2.61:443 check send-proxy-v2
    server ylc62 192.168.2.62:443 check send-proxy-v2
    server ylc63 192.168.2.63:443 check send-proxy-v2
    server ylc64 192.168.2.64:443 check send-proxy-v2
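The header comment says `send-proxy-v2` must be paired with Traefik `trustedIPs`. A minimal sketch that prints the matching Traefik static-configuration arguments for the `web` and `websecure` entrypoints used elsewhere in this commit. `traefik_proxy_protocol_args` is a hypothetical helper, and the HAProxy address passed to it is an assumption you must replace with your own:

```shell
# traefik_proxy_protocol_args CIDR: prints one --entryPoints.<name>.proxyProtocol.trustedIPs
# argument per entrypoint, trusting PROXY headers only from CIDR (the HAProxy host).
traefik_proxy_protocol_args() {
  local cidr="$1" ep
  for ep in web websecure; do
    printf -- '--entryPoints.%s.proxyProtocol.trustedIPs=%s\n' "$ep" "$cidr"
  done
}
```

Usage: `traefik_proxy_protocol_args 192.168.2.1/32`, where `192.168.2.1` stands in for the OpenWrt/HAProxy address. On K3s these arguments are typically supplied to the bundled Traefik through a `HelmChartConfig`; see `docs/01-08-openwrt-haproxy.md` for the repository's actual procedure.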
38
ansible/files/01-08-haproxy/haproxy.cfg
Normal file
@@ -0,0 +1,38 @@
# 01-08 OpenWrt HAProxy load balancing - base configuration
# Docs: docs/01-08-openwrt-haproxy.md
# Change 192.168.2.61~64 to your actual K3s node IPs
global
    log /dev/log local0
    maxconn 4096
    # Some OpenWrt builds need daemon / pidfile; adjust per distribution. If /dev/log is missing, use log 127.0.0.1 local0 instead

defaults
    mode http
    option httplog
    timeout connect 5s
    timeout client 30s
    timeout server 30s

frontend http_in
    bind *:80
    default_backend k3s_http

frontend https_in
    bind *:443
    mode tcp
    default_backend k3s_https

backend k3s_http
    balance roundrobin
    server ylc61 192.168.2.61:80 check
    server ylc62 192.168.2.62:80 check
    server ylc63 192.168.2.63:80 check
    server ylc64 192.168.2.64:80 check

backend k3s_https
    mode tcp
    balance roundrobin
    server ylc61 192.168.2.61:443 check
    server ylc62 192.168.2.62:443 check
    server ylc63 192.168.2.63:443 check
    server ylc64 192.168.2.64:443 check
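Since the node IPs must be edited by hand, it helps to list what a config actually points at. A minimal sketch; `list_backend_servers` is a hypothetical helper name:

```shell
# list_backend_servers: reads an HAProxy config on stdin and prints
# "<backend> <server-name> <address:port>" for every server line.
list_backend_servers() {
  awk '
    $1 == "backend" { backend = $2 }
    $1 == "server"  { print backend, $2, $3 }
  '
}
```

Usage: `list_backend_servers < haproxy.cfg`. If HAProxy is installed, `haproxy -c -f haproxy.cfg` additionally checks the file's syntax without starting the service.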
37
ansible/files/cloudflare-tunnel/cloudflared.yaml
Normal file
@@ -0,0 +1,37 @@
# docs/03-04-k3s-cloudflare-tunnel-配置接入.md: replace TUNNEL_TOKEN, then apply
apiVersion: v1
kind: Secret
metadata:
  name: cloudflared-credentials
  namespace: kube-system
type: Opaque
stringData:
  TUNNEL_TOKEN: "<YOUR_TUNNEL_TOKEN>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cloudflared
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cloudflared
  template:
    metadata:
      labels:
        app: cloudflared
    spec:
      containers:
        - name: cloudflared
          image: cloudflare/cloudflared:latest
          args:
            - tunnel
            - run
          env:
            - name: TUNNEL_TOKEN
              valueFrom:
                secretKeyRef:
                  name: cloudflared-credentials
                  key: TUNNEL_TOKEN
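To avoid committing a real token into this file, the placeholder can be filled in at apply time. A minimal sketch, assuming the token is available in an environment variable of your choosing; `render_tunnel_manifest` is a hypothetical helper name:

```shell
# render_tunnel_manifest TOKEN: reads the manifest on stdin and replaces every
# literal <YOUR_TUNNEL_TOKEN> with TOKEN. The token is passed through the
# environment so awk does not interpret any characters in it.
render_tunnel_manifest() {
  TOKEN_VALUE="$1" awk '
    BEGIN { ph = "<YOUR_TUNNEL_TOKEN>"; tok = ENVIRON["TOKEN_VALUE"] }
    {
      line = $0
      while ((i = index(line, ph)) > 0) {
        printf "%s%s", substr(line, 1, i - 1), tok
        line = substr(line, i + length(ph))
      }
      print line
    }
  '
}
```

Usage: `render_tunnel_manifest "$TUNNEL_TOKEN" < ansible/files/cloudflare-tunnel/cloudflared.yaml | kubectl apply -f -`.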
9
ansible/files/gitlab/README.md
Normal file
@@ -0,0 +1,9 @@
# GitLab CI examples (cross-referenced with docs)

| File | Document |
|------|------|
| `gitlab-ci-minimal.example.yml` | `docs/05-04-k3s-配置gitlab-cicd.md` |
| `gitlab-ci-multi-arch-deploy.example.yml` | `docs/05-04-k3s-配置gitlab-cicd.md` |
| `gitlab-ci-runner-tags.example.yml` | `docs/05-03-k3s-安装gitlab-含runner.md` |

Copy one as `.gitlab-ci.yml` or reference it via `include`; for variables and Runners, follow the documentation.
20
ansible/files/gitlab/gitlab-ci-minimal.example.yml
Normal file
@@ -0,0 +1,20 @@
# docs/05-04-k3s-配置gitlab-cicd.md: minimal .gitlab-ci.yml example
stages:
  - lint
  - deploy

variables:
  KUBECONFIG: "/builds/${CI_PROJECT_PATH}/kubeconfig"

lint:
  stage: lint
  script:
    - yamllint manifests || true

deploy:
  stage: deploy
  script:
    - echo "$KUBE_CONFIG_CONTENT" > "$KUBECONFIG"
    - kubectl --kubeconfig="$KUBECONFIG" apply -f manifests/
  only:
    - main
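The deploy job writes the kubeconfig with a bare `echo`, which silently produces an empty file if `KUBE_CONFIG_CONTENT` is unset. A slightly more defensive sketch that could stand in for that line; `write_kubeconfig` is a hypothetical helper, and `KUBE_CONFIG_CONTENT` is assumed to be a CI/CD variable holding the kubeconfig text, as the job implies:

```shell
# write_kubeconfig CONTENT TARGET: refuses empty content and writes TARGET
# with owner-only permissions (umask 077 applies inside the subshell only).
write_kubeconfig() {
  local content="$1" target="$2"
  if [ -z "$content" ]; then
    echo "kubeconfig content is empty; refusing to write $target" >&2
    return 1
  fi
  ( umask 077 && printf '%s\n' "$content" > "$target" )
}
```

Usage inside the job script: `write_kubeconfig "$KUBE_CONFIG_CONTENT" "$KUBECONFIG"`.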
14
ansible/files/gitlab/gitlab-ci-multi-arch-deploy.example.yml
Normal file
@@ -0,0 +1,14 @@
# docs/05-04-k3s-配置gitlab-cicd.md: multi-architecture Runner tags example
deploy_x86:
  stage: deploy
  tags: [x86]
  script:
    - echo "$KUBE_CONFIG_CONTENT" > "$KUBECONFIG"
    - kubectl --kubeconfig="$KUBECONFIG" apply -f manifests/x86/

deploy_arm64:
  stage: deploy
  tags: [arm64]
  script:
    - echo "$KUBE_CONFIG_CONTENT" > "$KUBECONFIG"
    - kubectl --kubeconfig="$KUBECONFIG" apply -f manifests/arm64/
15
ansible/files/gitlab/gitlab-ci-runner-tags.example.yml
Normal file
@@ -0,0 +1,15 @@
# docs/05-03-k3s-安装gitlab-含runner.md: example of matching Runner tags to jobs
build_x86:
  tags: [x86]
  script:
    - echo "build for x86"

build_arm64:
  tags: [arm64]
  script:
    - echo "build for arm64"

build_armv7:
  tags: [armv7]
  script:
    - echo "build for armv7"
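When registering a Runner on each machine, the tag has to match the job tags above. A minimal sketch that maps `uname -m` output to those three tags; `runner_tag_for_arch` is a hypothetical helper, and the mapping is an assumption about how the tags are meant to be used:

```shell
# runner_tag_for_arch ARCH: prints the job tag (x86, arm64, armv7) for a
# machine architecture string, or fails for anything unrecognized.
runner_tag_for_arch() {
  case "$1" in
    x86_64|amd64)  echo x86 ;;
    aarch64|arm64) echo arm64 ;;
    armv7l|armv7)  echo armv7 ;;
    *) echo "unknown architecture: $1" >&2; return 1 ;;
  esac
}
```

Usage: `runner_tag_for_arch "$(uname -m)"`.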
53
ansible/files/homer/homer.yaml
Normal file
@@ -0,0 +1,53 @@
# docs/05-01-k3s-部署homer首页面板.md: change host as needed
apiVersion: apps/v1
kind: Deployment
metadata:
  name: homer
  namespace: homer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: homer
  template:
    metadata:
      labels:
        app: homer
    spec:
      containers:
        - name: homer
          image: b4bz/homer:latest
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: homer
  namespace: homer
spec:
  selector:
    app: homer
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: homer
  namespace: homer
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: home.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: homer
                port:
                  number: 80
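This manifest places everything in the `homer` namespace but does not define a `Namespace` object, so the namespace has to exist before applying. A minimal sketch that lists the namespaces a manifest refers to; `manifest_namespaces` is a hypothetical helper name:

```shell
# manifest_namespaces [FILE...]: prints each distinct `namespace:` value found
# in the given manifests (or stdin when no file is given).
manifest_namespaces() {
  awk '$1 == "namespace:" { print $2 }' "$@" | sort -u
}
```

Usage: `manifest_namespaces ansible/files/homer/homer.yaml` prints `homer`; if that namespace does not exist yet, `kubectl create namespace homer` creates it.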
38
ansible/files/local-path-demo/local-path-pvc-demo.yaml
Normal file
@@ -0,0 +1,38 @@
# docs/03-05-k3s-local-path-pvc.md
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-pvc-demo
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-local-pvc-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-local-pvc-demo
  template:
    metadata:
      labels:
        app: nginx-local-pvc-demo
    spec:
      containers:
        - name: nginx
          image: nginx:alpine
          volumeMounts:
            - name: data
              mountPath: /usr/share/nginx/html
      volumes:
        - name: data
          persistentVolumeClaim:
            claimName: local-pvc-demo
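After applying the demo it is worth confirming the claim actually bound. A minimal sketch; `unbound_pvcs` is a hypothetical helper name. Note that with K3s' bundled local-path provisioner a claim typically stays `Pending` until a Pod that uses it has been scheduled, so check after the Deployment is running:

```shell
# unbound_pvcs: reads `kubectl get pvc --no-headers` output on stdin and
# prints the name of every claim whose STATUS column is not "Bound".
unbound_pvcs() {
  awk 'NF >= 2 && $2 != "Bound" { print $1 }'
}
```

Usage: `kubectl get pvc -n default --no-headers | unbound_pvcs`. Empty output means every claim is bound.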
27
ansible/files/nfs-demo/nfs-pv-pvc-demo.yaml
Normal file
@@ -0,0 +1,27 @@
# docs/03-06-k3s-使用nfs存储.md: change server/path for your environment
apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-pv-demo
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: 192.168.2.22
    path: /data/nfs
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc-demo
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  volumeName: nfs-pv-demo
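Before applying, it can help to confirm which export the PersistentVolume points at. A minimal sketch that pulls `server:path` out of the manifest; `nfs_export_of` is a hypothetical helper and only understands the simple layout used in this file:

```shell
# nfs_export_of [FILE...]: prints "<server>:<path>" for each `nfs:` block
# that carries both a server and a path key.
nfs_export_of() {
  awk '
    $1 == "nfs:"              { in_nfs = 1; server = ""; path = ""; next }
    in_nfs && $1 == "server:" { server = $2 }
    in_nfs && $1 == "path:"   { path = $2 }
    in_nfs && server != "" && path != "" { print server ":" path; in_nfs = 0 }
  ' "$@"
}
```

Usage: `nfs_export_of ansible/files/nfs-demo/nfs-pv-pvc-demo.yaml` prints `192.168.2.22:/data/nfs`. On a node with NFS client tools installed, `showmount -e 192.168.2.22` should then list that path.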
115
ansible/files/nginx-matrix-tls/01-control-ingress.yaml
Normal file
@@ -0,0 +1,115 @@
# 03-02 TLS: M1 control-plane node + Ingress, path / (root), domain test01.jackadam.top
# ConfigMap: index page + default.conf (single-file subPath mounts, same as M2~M4)
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m1-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M1</title></head>
    <body><h1>M1</h1><p>Control-plane node + Ingress</p><p><strong>Backend: M1</strong></p></body></html>
  default.conf: |
    server { listen 80 default_server; server_name _; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M1"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m1
  namespace: default
  labels:
    app: nginx-m1
    matrix: "03-02-m1"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m1
  template:
    metadata:
      labels:
        app: nginx-m1
    spec:
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      volumes:
        - name: html
          configMap:
            name: nginx-m1-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m1
  namespace: default
spec:
  selector:
    app: nginx-m1
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-m1
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls.certresolver: cloudflare
spec:
  tls:
    - hosts:
        - test01.jackadam.top
  rules:
    - host: test01.jackadam.top
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-m1
                port:
                  number: 80
---
# 03-02 HTTP-only: M1 route (web entrypoint only, no TLS), sharing the nginx-m1 Service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-m1-http
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: test01.jackadam.top
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-m1
                port:
                  number: 80
98
ansible/files/nginx-matrix-tls/02-control-ingressroute.yaml
Normal file
@@ -0,0 +1,98 @@
# 03-02 TLS: M2 control-plane node + IngressRoute, path / (root), domain test02.jackadam.top
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m2-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M2</title></head>
    <body><h1>M2</h1><p>Control-plane node + IngressRoute</p></body></html>
  default.conf: |
    server { listen 80; server_name localhost; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M2"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m2
  namespace: default
  labels:
    app: nginx-m2
    matrix: "03-02-m2"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m2
  template:
    metadata:
      labels:
        app: nginx-m2
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc61
      volumes:
        - name: html
          configMap:
            name: nginx-m2-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m2
  namespace: default
spec:
  selector:
    app: nginx-m2
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nginx-m2
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`test02.jackadam.top`)
      kind: Rule
      services:
        - name: nginx-m2
          port: 80
  tls:
    certResolver: cloudflare
---
# 03-02 HTTP-only: M2 route (web entrypoint only, no TLS), sharing the nginx-m2 Service
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nginx-m2-http
  namespace: default
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`test02.jackadam.top`)
      kind: Rule
      services:
        - name: nginx-m2
          port: 80
110
ansible/files/nginx-matrix-tls/03-worker-ingress.yaml
Normal file
@@ -0,0 +1,110 @@
# 03-02 TLS: M3 worker node + Ingress, path / (root), domain test03.jackadam.top
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m3-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M3</title></head>
    <body><h1>M3</h1><p>Worker node + Ingress</p></body></html>
  default.conf: |
    server { listen 80; server_name localhost; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M3"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m3
  namespace: default
  labels:
    app: nginx-m3
    matrix: "03-02-m3"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m3
  template:
    metadata:
      labels:
        app: nginx-m3
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      volumes:
        - name: html
          configMap:
            name: nginx-m3-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m3
  namespace: default
spec:
  selector:
    app: nginx-m3
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-m3
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls.certresolver: cloudflare
spec:
  tls:
    - hosts:
        - test03.jackadam.top
  rules:
    - host: test03.jackadam.top
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-m3
                port:
                  number: 80
---
# 03-02 HTTP-only: M3 route (web entrypoint only, no TLS), sharing the nginx-m3 Service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-m3-http
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: test03.jackadam.top
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: nginx-m3
                port:
                  number: 80
98
ansible/files/nginx-matrix-tls/04-worker-ingressroute.yaml
Normal file
@@ -0,0 +1,98 @@
# 03-02 TLS: M4 worker node + IngressRoute, path / (root), domain test04.jackadam.top
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m4-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M4</title></head>
    <body><h1>M4</h1><p>Worker node + IngressRoute</p></body></html>
  default.conf: |
    server { listen 80; server_name localhost; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M4"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m4
  namespace: default
  labels:
    app: nginx-m4
    matrix: "03-02-m4"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m4
  template:
    metadata:
      labels:
        app: nginx-m4
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc64
      volumes:
        - name: html
          configMap:
            name: nginx-m4-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m4
  namespace: default
spec:
  selector:
    app: nginx-m4
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nginx-m4
  namespace: default
spec:
  entryPoints:
    - websecure
  routes:
    - match: Host(`test04.jackadam.top`)
      kind: Rule
      services:
        - name: nginx-m4
          port: 80
  tls:
    certResolver: cloudflare
---
# 03-02 HTTP-only: M4 route (web entrypoint only, no TLS), sharing the nginx-m4 Service
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nginx-m4-http
  namespace: default
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`test04.jackadam.top`)
      kind: Rule
      services:
        - name: nginx-m4
          port: 80
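Each of the four TLS test sites sets an `X-Backend` response header, which makes it easy to confirm that a hostname reaches the intended Deployment. A minimal sketch that extracts the header from `curl -sI` output; `backend_of` is a hypothetical helper name:

```shell
# backend_of: reads HTTP response headers on stdin and prints the value of
# X-Backend (matched case-insensitively, carriage returns stripped).
backend_of() {
  tr -d '\r' | awk -F': *' 'tolower($1) == "x-backend" { print $2 }'
}
```

Usage: `curl -sI https://test01.jackadam.top/ | backend_of` should print `M1` once DNS and the certificate are in place, and likewise `M2` to `M4` for `test02` to `test04`.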
100
ansible/files/nginx-matrix/01-control-ingress.yaml
Normal file
@@ -0,0 +1,100 @@
# 02-05: Nginx + control-plane node + Ingress (M1)
# Path /demo-m1, any one control-plane node (nodeSelector + toleration; control-plane nodes often carry a NoSchedule taint)
# ConfigMap: index page + default.conf (single-file subPath mounts, same as M2~M4, so nginx is easy to extend later)
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m1-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M1</title></head>
    <body><h1>M1</h1><p>Control-plane node + Ingress</p><p><strong>Backend: M1</strong></p></body></html>
  default.conf: |
    server { listen 80 default_server; server_name _; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M1"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m1
  namespace: default
  labels:
    app: nginx-m1
    matrix: "02-05-m1"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m1
  template:
    metadata:
      labels:
        app: nginx-m1
    spec:
      nodeSelector:
        node-role.kubernetes.io/control-plane: ""
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      volumes:
        - name: html
          configMap:
            name: nginx-m1-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m1
  namespace: default
spec:
  selector:
    app: nginx-m1
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: stripprefix-m1
  namespace: default
spec:
  stripPrefix:
    prefixes:
      - /demo-m1
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-m1
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.middlewares: default-stripprefix-m1@kubernetescrd
spec:
  rules:
    - http:
        paths:
          - path: /demo-m1
            pathType: Prefix
            backend:
              service:
                name: nginx-m1
                port:
                  number: 80
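The `stripPrefix` middleware is what lets nginx serve `/demo-m1/...` from its root. The function below only illustrates the resulting path rewrite for requests the Ingress routes; it is not Traefik's implementation, and `strip_prefix` is a hypothetical name:

```shell
# strip_prefix PREFIX PATH: prints the path the backend would see after the
# prefix is removed; an empty remainder becomes "/". Paths that do not start
# with PREFIX at a segment boundary are printed unchanged.
strip_prefix() {
  local prefix="$1" path="$2" rest
  case "$path" in
    "$prefix"|"$prefix"/*)
      rest="${path#"$prefix"}"
      printf '%s\n' "${rest:-/}"
      ;;
    *)
      printf '%s\n' "$path"
      ;;
  esac
}
```

For example, `strip_prefix /demo-m1 /demo-m1/index.html` prints `/index.html`, which is why the unmodified `default.conf` with `root /usr/share/nginx/html` can serve the page.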
94
ansible/files/nginx-matrix/02-control-ingressroute.yaml
Normal file
@@ -0,0 +1,94 @@
# 02-05: Nginx + control-plane node + IngressRoute (M2)
# Path /demo-m2, pinned to one specific control-plane node (set kubernetes.io/hostname to the node's actual FQDN)
# ConfigMap: index page + default.conf; the X-Backend: M2 header makes the backend easy to tell apart
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m2-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M2</title></head>
    <body><h1>M2</h1><p>Control-plane node + IngressRoute</p></body></html>
  default.conf: |
    server { listen 80; server_name localhost; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M2"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m2
  namespace: default
  labels:
    app: nginx-m2
    matrix: "02-05-m2"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m2
  template:
    metadata:
      labels:
        app: nginx-m2
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc61
      volumes:
        - name: html
          configMap:
            name: nginx-m2-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m2
  namespace: default
spec:
  selector:
    app: nginx-m2
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: stripprefix-m2
  namespace: default
spec:
  stripPrefix:
    prefixes:
      - /demo-m2
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nginx-m2
  namespace: default
spec:
  entryPoints:
    - web
  routes:
    - match: PathPrefix(`/demo-m2`)
      kind: Rule
      middlewares:
        - name: stripprefix-m2
      services:
        - name: nginx-m2
          port: 80
96
ansible/files/nginx-matrix/03-worker-ingress.yaml
Normal file
@@ -0,0 +1,96 @@
# 02-05: Nginx + worker node + Ingress (M3)
# Path /demo-m3, any one worker node (nodeSelector: node-role.kubernetes.io/worker)
# ConfigMap: index page + default.conf; the X-Backend: M3 header makes the backend easy to tell apart
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m3-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M3</title></head>
    <body><h1>M3</h1><p>Worker node + Ingress</p></body></html>
  default.conf: |
    server { listen 80; server_name localhost; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M3"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m3
  namespace: default
  labels:
    app: nginx-m3
    matrix: "02-05-m3"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m3
  template:
    metadata:
      labels:
        app: nginx-m3
    spec:
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      volumes:
        - name: html
          configMap:
            name: nginx-m3-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m3
  namespace: default
spec:
  selector:
    app: nginx-m3
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: stripprefix-m3
  namespace: default
spec:
  stripPrefix:
    prefixes:
      - /demo-m3
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-m3
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.middlewares: default-stripprefix-m3@kubernetescrd
spec:
  rules:
    - http:
        paths:
          - path: /demo-m3
            pathType: Prefix
            backend:
              service:
                name: nginx-m3
                port:
                  number: 80
94
ansible/files/nginx-matrix/04-worker-ingressroute.yaml
Normal file
@@ -0,0 +1,94 @@
# 03-04: Nginx + worker node + IngressRoute (M4)
# Path /demo-m4; pinned to one worker node (change kubernetes.io/hostname to match the actual FQDN)
# ConfigMap: index page + default.conf; the X-Backend: M4 header makes this backend easy to tell apart
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-m4-html
  namespace: default
data:
  index.html: |
    <!DOCTYPE html>
    <html><head><meta charset="utf-8"><title>M4</title></head>
    <body><h1>M4</h1><p>工作节点 + IngressRoute</p></body></html>
  default.conf: |
    server { listen 80; server_name localhost; root /usr/share/nginx/html; index index.html; location / { add_header X-Backend "M4"; try_files $uri $uri/ /index.html; } }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-m4
  namespace: default
  labels:
    app: nginx-m4
    matrix: "02-05-m4"
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-m4
  template:
    metadata:
      labels:
        app: nginx-m4
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc64
      volumes:
        - name: html
          configMap:
            name: nginx-m4-html
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
          volumeMounts:
            - name: html
              mountPath: /usr/share/nginx/html/index.html
              subPath: index.html
              readOnly: true
            - name: html
              mountPath: /etc/nginx/conf.d/default.conf
              subPath: default.conf
              readOnly: true
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-m4
  namespace: default
spec:
  selector:
    app: nginx-m4
  ports:
    - port: 80
      targetPort: 80
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: stripprefix-m4
  namespace: default
spec:
  stripPrefix:
    prefixes:
      - /demo-m4
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: nginx-m4
  namespace: default
spec:
  entryPoints:
    - web
  routes:
    - match: PathPrefix(`/demo-m4`)
      kind: Rule
      middlewares:
        - name: stripprefix-m4
      services:
        - name: nginx-m4
          port: 80
12
ansible/files/nginx-matrix/README.md
Normal file
@@ -0,0 +1,12 @@
# Nginx matrix manifests

Used by `ansible/playbooks/nginx-matrix-deploy.yml` for one-command deployment.

| File | Scenario | Path | Node |
|------|----------|------|------|
| 01-control-ingress.yaml | M1: control node + Ingress | /demo-m1 | no nodeSelector |
| 02-control-ingressroute.yaml | M2: control node + IngressRoute | /demo-m2 | no nodeSelector |
| 03-worker-ingress.yaml | M3: worker node + Ingress | /demo-m3 | nodeSelector=worker (any worker) |
| 04-worker-ingressroute.yaml | M4: worker node + IngressRoute | /demo-m4 | nodeSelector=ylc64 |

M4 is pinned to ylc64 by default and M3 lands on any worker node; adjust both to match your cluster.
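The four routes in the table above can be smoke-tested from any machine that reaches the Traefik `web` entrypoint. The following is only a minimal Python sketch under stated assumptions: the base URL is a placeholder you must supply, `build_checks` and `check` are hypothetical helper names, and the `X-Backend` values for M1 and M2 are assumed by analogy with M3 and M4 (whose ConfigMaps set `X-Backend: M3` and `X-Backend: M4`).

```python
from urllib.request import urlopen

# Path -> expected X-Backend header, following the README table.
# M3/M4 values come from their default.conf; M1/M2 are assumptions.
ROUTES = {
    "/demo-m1": "M1",
    "/demo-m2": "M2",
    "/demo-m3": "M3",
    "/demo-m4": "M4",
}


def build_checks(base_url):
    """Return (url, expected_backend) pairs for every matrix route."""
    base = base_url.rstrip("/")
    return [(base + path + "/", backend) for path, backend in ROUTES.items()]


def check(url, expected, timeout=5):
    """Fetch url and report whether it answers 200 with the expected X-Backend."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.status == 200 and resp.headers.get("X-Backend") == expected
```

To use it, call `build_checks("http://<node-ip>")` with the address of a node that exposes Traefik and pass each pair to `check`. A mismatch in `X-Backend` suggests the request reached a different backend than the route was meant to hit.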
54
ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
Normal file
@@ -0,0 +1,54 @@
# Corresponding doc: docs/04-01-k3s-nodejs-高级部署.md
# Cumulative: baseline (Deployment + Service + Ingress)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      containers:
        - name: nodejs-demo
          image: node:18-alpine
          command: ["node", "-e", "require('http').createServer((req,res)=>res.end('Hello World from Node.js')).listen(3000)"]
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
58
ansible/files/nodejs-demo/04-02-nodejs-demo.yaml
Normal file
@@ -0,0 +1,58 @@
# Corresponding doc: docs/04-02-nodejs-镜像与运行命令.md
# Cumulative: 04-01 + pinned image tag, imagePullPolicy, command/args
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          command: ["node"]
          args:
            - "-e"
            - "require('http').createServer((req,res)=>res.end('Hello from pinned image')).listen(3000)"
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
75
ansible/files/nodejs-demo/04-03-nodejs-demo.yaml
Normal file
@@ -0,0 +1,75 @@
# Corresponding doc: docs/04-03-nodejs-环境变量与配置注入.md
# Cumulative: 04-02 + ConfigMap + APP_MSG injected via env (image stays 18.20-alpine, same as 04-02)
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(3000);
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 3000
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
75
ansible/files/nodejs-demo/04-04-nodejs-demo.yaml
Normal file
@@ -0,0 +1,75 @@
# Corresponding doc: docs/04-04-nodejs-端口与Service.md
# Cumulative: 04-03 + container and process now listen on 8080, Service targetPort aligned
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
82
ansible/files/nodejs-demo/04-05-nodejs-demo.yaml
Normal file
@@ -0,0 +1,82 @@
# Corresponding doc: docs/04-05-nodejs-资源请求与限制.md
# Cumulative: 04-04 + resources.requests/limits
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
94
ansible/files/nodejs-demo/04-06-nodejs-demo.yaml
Normal file
@@ -0,0 +1,94 @@
# Corresponding doc: docs/04-06-nodejs-探针与健康检查.md
# Cumulative: 04-05 + livenessProbe/readinessProbe (port 8080, path /)
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
96
ansible/files/nodejs-demo/04-07-nodejs-demo.yaml
Normal file
@@ -0,0 +1,96 @@
# Corresponding doc: docs/04-07-nodejs-调度与亲和.md
# Cumulative: 04-06 + nodeSelector (default ylc62; change it to the short hostname of a node in your cluster)
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc62
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
109
ansible/files/nodejs-demo/04-08-nodejs-demo.yaml
Normal file
@@ -0,0 +1,109 @@
# Corresponding doc: docs/04-08-nodejs-安全上下文.md
# Cumulative: 04-07 + pod securityContext.fsGroup, container securityContext, read-only root filesystem, /tmp emptyDir
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc62
      securityContext:
        fsGroup: 1000
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            readOnlyRootFilesystem: true
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
          volumeMounts:
            - name: tmp
              mountPath: /tmp
      volumes:
        - name: tmp
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
127
ansible/files/nodejs-demo/04-09-nodejs-demo.yaml
Normal file
@@ -0,0 +1,127 @@
# Corresponding doc: docs/04-09-nodejs-存储与卷.md
# Cumulative: 04-08 + PVC nodejs-demo-data (default storageClassName: local-path; switch to longhorn etc. to suit your cluster) + mount at /data
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nodejs-demo-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc62
      securityContext:
        fsGroup: 1000
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            readOnlyRootFilesystem: true
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: data
              mountPath: /data
      volumes:
        - name: tmp
          emptyDir: {}
        - name: data
          persistentVolumeClaim:
            claimName: nodejs-demo-data
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - http:
        paths:
          - path: /node
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
128
ansible/files/nodejs-demo/04-10-nodejs-demo.yaml
Normal file
@@ -0,0 +1,128 @@
# Corresponding doc: docs/04-10-nodejs-Ingress与Traefik.md
# Cumulative: 04-09 + Ingress adds a host and the path changes to /api (requests need Host: app.example.local)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nodejs-demo-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc62
      securityContext:
        fsGroup: 1000
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            readOnlyRootFilesystem: true
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: data
              mountPath: /data
      volumes:
        - name: tmp
          emptyDir: {}
        - name: data
          persistentVolumeClaim:
            claimName: nodejs-demo-data
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: app.example.local
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
133
ansible/files/nodejs-demo/04-11-nodejs-demo.yaml
Normal file
@@ -0,0 +1,133 @@
# Corresponding doc: docs/04-11-nodejs-副本与滚动发布.md
# Cumulative: 04-10 + replicas: 3 + RollingUpdate (maxSurge: 1, maxUnavailable: 0)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nodejs-demo-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc62
      securityContext:
        fsGroup: 1000
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            readOnlyRootFilesystem: true
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: data
              mountPath: /data
      volumes:
        - name: tmp
          emptyDir: {}
        - name: data
          persistentVolumeClaim:
            claimName: nodejs-demo-data
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: app.example.local
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
140
ansible/files/nodejs-demo/04-12-nodejs-demo.yaml
Normal file
@@ -0,0 +1,140 @@
# Corresponding doc: docs/04-12-nodejs-TLS与证书.md
# Cumulative: 04-11 + Ingress TLS (websecure, secretName: nodejs-demo-tls)
# Create the TLS Secret before applying, for example:
#   kubectl create secret tls nodejs-demo-tls --cert=fullchain.pem --key=privkey.pem -n default
# The certificate SAN must cover app.example.local (matching rules.host / tls.hosts)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nodejs-demo-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc62
      securityContext:
        fsGroup: 1000
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            readOnlyRootFilesystem: true
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: data
              mountPath: /data
      volumes:
        - name: tmp
          emptyDir: {}
        - name: data
          persistentVolumeClaim:
            claimName: nodejs-demo-data
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
spec:
  tls:
    - hosts:
        - app.example.local
      secretName: nodejs-demo-tls
  rules:
    - host: app.example.local
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
157
ansible/files/nodejs-demo/04-13-nodejs-demo.yaml
Normal file
@@ -0,0 +1,157 @@
# Related doc: docs/04-13-nodejs-HPA.md
# Cumulative: 04-12 + HorizontalPodAutoscaler (CPU 50%, min 1, max 5)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nodejs-demo-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nodejs-demo
  namespace: default
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  selector:
    matchLabels:
      app: nodejs-demo
  template:
    metadata:
      labels:
        app: nodejs-demo
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc62
      securityContext:
        fsGroup: 1000
      containers:
        - name: nodejs-demo
          image: node:18.20-alpine
          imagePullPolicy: IfNotPresent
          securityContext:
            allowPrivilegeEscalation: false
            runAsNonRoot: true
            runAsUser: 1000
            readOnlyRootFilesystem: true
          env:
            - name: APP_MSG
              valueFrom:
                configMapKeyRef:
                  name: nodejs-demo-config
                  key: APP_MSG
          command:
            - node
            - "-e"
            - |
              const http=require('http');
              const msg=process.env.APP_MSG||'no env';
              http.createServer((q,s)=>s.end(msg)).listen(8080);
          ports:
            - containerPort: 8080
          resources:
            requests:
              cpu: "50m"
              memory: "64Mi"
            limits:
              cpu: "500m"
              memory: "256Mi"
          livenessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 3
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /
              port: 8080
            initialDelaySeconds: 2
            periodSeconds: 5
          volumeMounts:
            - name: tmp
              mountPath: /tmp
            - name: data
              mountPath: /data
      volumes:
        - name: tmp
          emptyDir: {}
        - name: data
          persistentVolumeClaim:
            claimName: nodejs-demo-data
---
apiVersion: v1
kind: Service
metadata:
  name: nodejs-demo
  namespace: default
spec:
  selector:
    app: nodejs-demo
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
spec:
  tls:
    - hosts:
        - app.example.local
      secretName: nodejs-demo-tls
  rules:
    - host: app.example.local
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nodejs-demo
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nodejs-demo
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50
42
ansible/files/nodejs-demo/README.md
Normal file
@@ -0,0 +1,42 @@
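# Node.js demo manifests (aligned with docs/04-01 to 04-14)

**Single source of truth**: the YAML files in this directory match the descriptions in `docs/`; the docs do not repeat the full manifests, to avoid drift.

## Cumulative rule

- `04-0N-nodejs-demo.yaml` is the **single** complete state, ready for `kubectl apply -f`, that results from working through docs **04-01 to 04-0N** in order (multiple resources are separated by `---`).
- **You can jump straight to the last manifest** for experiments; there is no need to apply every file in turn. To understand each incremental step, read the docs in numeric order and diff each pair of adjacent YAML files.
- **04-14** (GitOps/CI) has no manifest of its own; see `docs/04-14-nodejs-GitOps与CI流水线.md`, `docs/05-04-k3s-配置gitlab-cicd.md`, and `docs/03-09-k3s-gitops-集群配置管理.md`.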
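
The "diff adjacent manifests" advice can be sketched as a small shell helper. This is only an illustration, not something shipped in the repository: the function names `manifest_path` and `diff_step` are hypothetical, and the sketch assumes it is run from the repository root with the `04-NN-nodejs-demo.yaml` naming used in this directory.

```bash
# Hypothetical helpers (not part of the repository).
# Run from the repository root; pass step numbers as plain decimals (7, not 07).

# Print the path of the cumulative manifest for step N.
manifest_path() {
  printf 'ansible/files/nodejs-demo/04-%02d-nodejs-demo.yaml\n' "$1"
}

# Show what step N adds on top of step N-1 (valid for N = 2..13).
diff_step() {
  step="$1"
  if [ "$step" -lt 2 ] || [ "$step" -gt 13 ]; then
    echo "step must be between 2 and 13" >&2
    return 2
  fi
  diff -u "$(manifest_path $((step - 1)))" "$(manifest_path "$step")"
}
```

For example, `diff_step 13` compares the 04-12 and 04-13 manifests; going by the header comment of 04-13, the main difference should be the added `HorizontalPodAutoscaler`.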

## File-to-doc mapping

| File | Doc | Notes |
|------|------|------|
| `04-01-nodejs-demo.yaml` | `docs/04-01-k3s-nodejs-高级部署.md` | Baseline: port 3000, path `/node`, no host |
| `04-02-nodejs-demo.yaml` | `docs/04-02-nodejs-镜像与运行命令.md` | Pinned image tag, `imagePullPolicy` |
| `04-03-nodejs-demo.yaml` | `docs/04-03-nodejs-环境变量与配置注入.md` | + ConfigMap; for a Secret example see `nodejs-demo-secret.example.yaml`, referenced at the end of the doc |
| `04-04-nodejs-demo.yaml` | `docs/04-04-nodejs-端口与Service.md` | Listen port changed to **8080** (probes and all later steps use 8080 from 04-04 onward) |
| `04-05-nodejs-demo.yaml` | `docs/04-05-nodejs-资源请求与限制.md` | + resources |
| `04-06-nodejs-demo.yaml` | `docs/04-06-nodejs-探针与健康检查.md` | + probes |
| `04-07-nodejs-demo.yaml` | `docs/04-07-nodejs-调度与亲和.md` | + `nodeSelector` (defaults to **ylc62**; change it to one of your own node names) |
| `04-08-nodejs-demo.yaml` | `docs/04-08-nodejs-安全上下文.md` | + non-root user, read-only root filesystem, `/tmp` emptyDir |
| `04-09-nodejs-demo.yaml` | `docs/04-09-nodejs-存储与卷.md` | + PVC `nodejs-demo-data` (defaults to **local-path**) |
| `04-10-nodejs-demo.yaml` | `docs/04-10-nodejs-Ingress与Traefik.md` | Ingress: `host` + `/api`; curl needs a **Host** header |
| `04-11-nodejs-demo.yaml` | `docs/04-11-nodejs-副本与滚动发布.md` | replicas=3 + RollingUpdate |
| `04-12-nodejs-demo.yaml` | `docs/04-12-nodejs-TLS与证书.md` | **websecure** + TLS; the `nodejs-demo-tls` Secret must be created first |
| `04-13-nodejs-demo.yaml` | `docs/04-13-nodejs-HPA.md` | + HPA (requires metrics-server) |
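
To get a feel for what the HPA in `04-13-nodejs-demo.yaml` aims for, the sketch below approximates the documented Kubernetes scaling rule, `ceil(currentReplicas * currentUtilization / targetUtilization)`, clamped to the manifest's `minReplicas: 1` and `maxReplicas: 5` with a 50% target. The function name `hpa_desired_replicas` is hypothetical, and the sketch deliberately ignores the controller's tolerance band, stabilization window, and handling of not-ready Pods, so treat it as a rough estimate only. Utilization is measured against the container's CPU request (`50m` in the manifest).

```bash
# Hypothetical helper: estimate the replica count the HPA would aim for.
# Arguments: current replica count, observed average CPU utilization (% of the CPU request).
hpa_desired_replicas() {
  hpa_cur="$1"
  hpa_util="$2"
  hpa_target=50   # averageUtilization from 04-13-nodejs-demo.yaml
  hpa_min=1       # minReplicas
  hpa_max=5       # maxReplicas

  # Integer ceiling of cur * util / target.
  hpa_want=$(( (hpa_cur * hpa_util + hpa_target - 1) / hpa_target ))

  # Clamp to the [min, max] range.
  if [ "$hpa_want" -lt "$hpa_min" ]; then hpa_want=$hpa_min; fi
  if [ "$hpa_want" -gt "$hpa_max" ]; then hpa_want=$hpa_max; fi
  echo "$hpa_want"
}
```

With 3 replicas at 100% utilization the raw result is 6, which the `maxReplicas` clamp reduces to 5; `kubectl get hpa nodejs-demo` shows what the real controller decided.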

## How to apply

```bash
# From the repository root
kubectl apply -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
```

Alternatively, use Ansible: `ansible/playbooks/nodejs-demo-apply.yml`, with the variable `nodejs_demo_manifest` set to the manifest file name.
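
A possible way to call that playbook is sketched below. The wrapper name `apply_demo_manifest` is hypothetical, and the sketch assumes the inventory lives at `ansible/inventory.ini`, that the command is run from the repository root, and that the playbook resolves `nodejs_demo_manifest` relative to `ansible/files/nodejs-demo/`; check the playbook itself before relying on that last point.

```bash
# Hypothetical wrapper (not part of the repository): apply one demo manifest via Ansible.
# Argument: manifest file name, e.g. 04-13-nodejs-demo.yaml
apply_demo_manifest() {
  demo_file="$1"

  # Fail early if the manifest is not where this sketch expects it.
  if [ ! -f "ansible/files/nodejs-demo/$demo_file" ]; then
    echo "manifest not found: ansible/files/nodejs-demo/$demo_file" >&2
    return 1
  fi

  # -e passes the variable named in this README to the playbook.
  ansible-playbook -i ansible/inventory.ini \
    ansible/playbooks/nodejs-demo-apply.yml \
    -e "nodejs_demo_manifest=$demo_file"
}
```

Usage: `apply_demo_manifest 04-13-nodejs-demo.yaml`.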

## dry-run

```bash
kubectl apply --dry-run=client -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
```
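
The single-file dry-run can be extended to every cumulative manifest with a short loop. This is a minimal sketch: the function name `dry_run_all` is hypothetical, and it assumes `kubectl` is on the `PATH` and already configured for the cluster.

```bash
# Hypothetical helper (not part of the repository):
# client-side dry-run of every cumulative manifest in a directory.
# Optional argument: directory to scan (defaults to the demo directory).
dry_run_all() {
  dry_dir="${1:-ansible/files/nodejs-demo}"
  dry_failed=0
  for dry_f in "$dry_dir"/04-*-nodejs-demo.yaml; do
    [ -e "$dry_f" ] || continue   # the glob matched nothing
    if ! kubectl apply --dry-run=client -f "$dry_f" >/dev/null; then
      echo "FAILED: $dry_f" >&2
      dry_failed=1
    fi
  done
  return "$dry_failed"
}
```

`dry_run_all` returns non-zero if any manifest fails the dry-run, which makes it usable as a pre-commit or CI check.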
@@ -0,0 +1,8 @@
# Example: do not commit real secrets to a public repository. Corresponds to the Secret illustration in docs/04-03.
apiVersion: v1
kind: Secret
metadata:
  name: nodejs-demo-secret
  namespace: default
stringData:
  API_TOKEN: "replace-me"
43
ansible/files/onenav/onenav-proxy.yaml
Normal file
@@ -0,0 +1,43 @@
# docs/05-02-onenav首页面板.md: change the Endpoints IP and the Ingress host
apiVersion: v1
kind: Service
metadata:
  name: onenav-external
  namespace: default
spec:
  ports:
    - name: http
      port: 80
      targetPort: 7070
---
apiVersion: v1
kind: Endpoints
metadata:
  name: onenav-external
  namespace: default
subsets:
  - addresses:
      - ip: 192.168.2.22
    ports:
      - port: 7070
        name: http
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: onenav
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: onenav.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: onenav-external
                port:
                  number: 80
74
ansible/files/openclaw/openclaw-k3s-experimental.yaml
Normal file
@@ -0,0 +1,74 @@
# docs/05-08-openclaw-k3s-实验部署.md: experimental only; replace the image and the domain
apiVersion: v1
kind: Namespace
metadata:
  name: openclaw
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: openclaw-gateway
  namespace: openclaw
spec:
  replicas: 1
  selector:
    matchLabels:
      app: openclaw-gateway
  template:
    metadata:
      labels:
        app: openclaw-gateway
    spec:
      containers:
        - name: openclaw-gateway
          image: registry.local/openclaw:local
          imagePullPolicy: IfNotPresent
          env:
            - name: OPENCLAW_GATEWAY_MODE
              value: "local"
          ports:
            - containerPort: 18789
          volumeMounts:
            - name: config
              mountPath: /home/node/.openclaw
            - name: workspace
              mountPath: /home/node/.openclaw/workspace
      volumes:
        - name: config
          emptyDir: {}
        - name: workspace
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: openclaw-gateway
  namespace: openclaw
spec:
  selector:
    app: openclaw-gateway
  ports:
    - port: 18789
      targetPort: 18789
      protocol: TCP
      name: http
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openclaw-gateway
  namespace: openclaw
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: openclaw-k3s.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: openclaw-gateway
                port:
                  number: 18789
43
ansible/files/openclaw/openclaw-proxy.yaml
Normal file
@@ -0,0 +1,43 @@
# docs/05-07-openclaw应用部署.md: change the IP / host
apiVersion: v1
kind: Service
metadata:
  name: openclaw-external
  namespace: default
spec:
  ports:
    - name: http
      port: 80
      targetPort: 18789
---
apiVersion: v1
kind: Endpoints
metadata:
  name: openclaw-external
  namespace: default
subsets:
  - addresses:
      - ip: 192.168.2.70
    ports:
      - port: 18789
        name: http
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: openclaw
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: openclaw.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: openclaw-external
                port:
                  number: 80
27
ansible/files/openlist/app-data-backup-cronjob.yaml
Normal file
@@ -0,0 +1,27 @@
# docs/06-03-k3s-自动备份与恢复-openlist-webdav.md: replace the image, hostPath, and remote name
apiVersion: batch/v1
kind: CronJob
metadata:
  name: app-data-backup
  namespace: default
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: app-data-backup
              image: your-registry/app-backup:latest
              args:
                - /bin/sh
                - -c
                - rclone sync /data openlist-webdav:backups/app-data
              volumeMounts:
                - name: app-data
                  mountPath: /data
          volumes:
            - name: app-data
              hostPath:
                path: /data/app
          restartPolicy: OnFailure
24
ansible/files/openlist/app-data-restore-job.yaml
Normal file
@@ -0,0 +1,24 @@
# docs/06-03-k3s-自动备份与恢复-openlist-webdav.md: one-off restore Job
apiVersion: batch/v1
kind: Job
metadata:
  name: app-data-restore
  namespace: default
spec:
  template:
    spec:
      containers:
        - name: app-data-restore
          image: your-registry/app-backup:latest
          args:
            - /bin/sh
            - -c
            - rclone sync openlist-webdav:backups/app-data /data
          volumeMounts:
            - name: app-data
              mountPath: /data
      volumes:
        - name: app-data
          hostPath:
            path: /data/app
      restartPolicy: OnFailure
27
ansible/files/openlist/openlist-backup-cronjob.yaml
Normal file
@@ -0,0 +1,27 @@
# docs/05-06-openlist挂载网盘与自动备份.md: replace the image and the PVC name
apiVersion: batch/v1
kind: CronJob
metadata:
  name: openlist-backup
  namespace: default
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: openlist-backup
              image: your-registry/openlist-backup:latest
              args:
                - /bin/sh
                - -c
                - /backup.sh
              volumeMounts:
                - name: backup-target
                  mountPath: /backup
          volumes:
            - name: backup-target
              persistentVolumeClaim:
                claimName: openlist-backup-pvc
          restartPolicy: OnFailure
38
ansible/files/traefik-acme/traefik-acme.yaml
Normal file
@@ -0,0 +1,38 @@
# 03-02 Traefik ACME configuration (HelmChartConfig)
# Includes: ACME (Cloudflare DNS-01), ping health check (websecure), PROXY protocol trustedIPs
# Before use: replace <YOUR_REAL_EMAIL>, create the cloudflare-api-token Secret, and adjust nodeSelector/trustedIPs to your environment
# Deploy: kubectl apply -f traefik-acme.yaml (or copy it into the K3s manifests directory)
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    additionalArguments:
      - "--log.level=INFO"
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.resolvers=1.1.1.1:53,1.0.0.1:53"
      - "--certificatesresolvers.cloudflare.acme.email=<YOUR_REAL_EMAIL>"
      - "--certificatesresolvers.cloudflare.acme.storage=/data/acme.json"
      # - "--certificatesresolvers.cloudflare.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"  # for testing only; remove before going live
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.provider=cloudflare"
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.propagation.delayBeforeChecks=600"

      # Health check: GET /ping returns 200 on 443 (HTTPS), so HAProxy can run option httpchk + ssl against 443
      - "--ping=true"
      - "--ping.entryPoint=websecure"

      # PROXY protocol: trustedIPs must include the IP/subnet where HAProxy runs
      - "--entrypoints.web.proxyProtocol.trustedIPs=192.168.2.0/24"
      - "--entrypoints.websecure.proxyProtocol.trustedIPs=192.168.2.0/24"

    env:
      - name: CF_DNS_API_TOKEN
        valueFrom:
          secretKeyRef:
            name: cloudflare-api-token
            key: api-token

    nodeSelector:
      kubernetes.io/hostname: ylc61
25
ansible/files/traefik-custom-ports/traefik-custom-ports.yaml
Normal file
@@ -0,0 +1,25 @@
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        expose: true
      websecure:
        expose: true
      # Custom HTTP entry point (example: 18080)
      web18080:
        port: 18080
        expose:
          default: true
        exposedPort: 18080
      # Custom HTTPS entry point (example: 18443)
      websecure18443:
        port: 18443
        expose:
          default: true
        exposedPort: 18443
60
ansible/files/traefik-dashboard-acme/tomcat-acme-test05.yaml
Normal file
@@ -0,0 +1,60 @@
# docs/03-03, section 5: Tomcat + test05.jackadam.top to verify HTTPS (change the domain as needed)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tomcat-test05
  namespace: default
  labels:
    app: tomcat-test05
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tomcat-test05
  template:
    metadata:
      labels:
        app: tomcat-test05
    spec:
      containers:
        - name: tomcat
          image: tomcat:9.0
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: tomcat-test05
  namespace: default
spec:
  selector:
    app: tomcat-test05
  ports:
    - port: 8080
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: tomcat-test05-acme
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
    traefik.ingress.kubernetes.io/router.tls.certresolver: cloudflare
spec:
  tls:
    - hosts:
        - test05.jackadam.top
  rules:
    - host: test05.jackadam.top
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: tomcat-test05
                port:
                  number: 8080
@@ -0,0 +1,49 @@
# 03-03 Traefik Dashboard + ACME combined configuration (HelmChartConfig)
# Includes: Dashboard, ACME (Cloudflare DNS-01), ping, PROXY protocol (same as 03-02)
# Before use: replace <YOUR_REAL_EMAIL>, create the cloudflare-api-token Secret, and adjust nodeSelector/trustedIPs to your environment
# Deploy: kubectl apply -f traefik-dashboard-acme.yaml
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        expose: true
      websecure:
        expose: true

    additionalArguments:
      - "--api.dashboard=true"
      - "--api.insecure=true"

      - "--log.level=INFO"
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.resolvers=1.1.1.1:53,1.0.0.1:53"
      - "--certificatesresolvers.cloudflare.acme.email=<YOUR_REAL_EMAIL>"
      - "--certificatesresolvers.cloudflare.acme.storage=/data/acme.json"
      # - "--certificatesresolvers.cloudflare.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"  # for testing only; remove before going live
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.provider=cloudflare"
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.propagation.delayBeforeChecks=600"

      - "--ping=true"
      - "--ping.entryPoint=websecure"

      - "--entrypoints.web.proxyProtocol.trustedIPs=192.168.2.0/24"
      - "--entrypoints.websecure.proxyProtocol.trustedIPs=192.168.2.0/24"

    env:
      - name: CF_DNS_API_TOKEN
        valueFrom:
          secretKeyRef:
            name: cloudflare-api-token
            key: api-token

    nodeSelector:
      kubernetes.io/hostname: ylc61

    ingressRoute:
      dashboard:
        enabled: true
37
ansible/files/traefik-dashboard/traefik-dashboard.yaml
Normal file
@@ -0,0 +1,37 @@
# 03-01 Traefik Dashboard (HelmChartConfig + IngressRoute)
# Deploy: kubectl apply -f traefik-dashboard.yaml (or copy it into K3s server/manifests/)
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        expose: true
      websecure:
        expose: true
      traefik:
        expose: true

    additionalArguments:
      - "--api.dashboard=true"
      - "--api.insecure=true"

---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard
  namespace: kube-system
spec:
  entryPoints:
    - web
  routes:
    - match: PathPrefix(`/dashboard`) || PathPrefix(`/api`)
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService
22
ansible/group_vars/all.yml
Normal file
@@ -0,0 +1,22 @@
---
# Connect over SSH as root (setup-k3s-workers-ssh.sh has already written the same public key to root on every node)
ansible_user: root

timezone: "Asia/Shanghai"

# k3s settings
k3s_version: ""  # empty means the installer's default version from get.k3s.io
k3s_data_dir: "/storage"
k3s_server_ip: "192.168.2.61"

# Optional: whether to manage /etc/hosts and the firewalld baseline
k3s_manage_hosts: true
k3s_manage_firewalld: true

# (Ingress node selection is now done by labeling manually with kubectl; Ansible no longer manages the Traefik ingress labels)

## Role labels (auto-labeling switch)
# If you want Ansible to label nodes with the control-plane / worker roles automatically after installation (for the 02-05 matrix and similar scenarios),
# enable this switch; the default, true, labels nodes according to the k3s_server / k3s_worker groups in the inventory.
# To manage role labels entirely by hand, set it to false and follow the kubectl examples in `01-02-k3s-工作节点.md`.
k3s_manage_role_labels: true
14
ansible/inventory.ini
Normal file
@@ -0,0 +1,14 @@
[k3s_server]
# SSH as root; setup-k3s-workers-ssh.sh configures every node (including the control node)
ylc61 ansible_host=192.168.2.61 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.61

[k3s_worker]
# Uses the per-node keys generated by setup-k3s-workers-ssh.sh; the paths must match what actually exists on the control machine
ylc62 ansible_host=192.168.2.62 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.62
ylc63 ansible_host=192.168.2.63 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.63
ylc64 ansible_host=192.168.2.64 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.64

[k3s_nodes:children]
k3s_server
k3s_worker
197
ansible/playbooks/k3s-init-and-install.yml
Normal file
@@ -0,0 +1,197 @@
---
- name: Init base system
  hosts: k3s_nodes
  become: true
  tasks:
    # Check whether firewalld is running on this node; later conditions use the result
    - name: Check if firewalld is running
      ansible.builtin.command: firewall-cmd --state
      register: firewalld_state
      changed_when: false
      failed_when: false

    # Set the system time zone from the global timezone variable (optional)
    - name: Set timezone
      ansible.builtin.command: timedatectl set-timezone {{ timezone }}
      when: timezone is defined and timezone != ""

    # Install the basic tools k3s needs (curl, git, etc.)
    - name: Install basic packages
      ansible.builtin.package:
        name:
          - curl
          - git
        state: present

    # Ensure /etc/hosts resolves the host names of all k3s nodes (optional)
    - name: Ensure /etc/hosts has entries for all k3s nodes
      ansible.builtin.lineinfile:
        path: /etc/hosts
        regexp: '^\S+\s+{{ item }}\s*$'
        line: "{{ hostvars[item]['ansible_host'] }} {{ item }}"
        state: present
      loop: "{{ groups['k3s_nodes'] }}"
      when:
        - k3s_manage_hosts | default(true) | bool
        - hostvars[item]['ansible_host'] is defined

    # Ports k3s needs: 8472/udp (flannel VXLAN) on every node; 6443/tcp (API) on the server only
    # They must be opened before k3s is installed; otherwise workers cannot connect and flannel cannot build its overlay
    # Open 8472/udp for flannel VXLAN on all k3s nodes
    - name: Open flannel VXLAN port (8472/udp) on all k3s nodes
      ansible.builtin.command: firewall-cmd --permanent --add-port=8472/udp
      when:
        - k3s_manage_firewalld | default(true) | bool
        - firewalld_state.stdout | default('') == 'running'

    # Open the k3s API port 6443/tcp on the server node
    - name: Open k3s API port (6443/tcp) on server
      ansible.builtin.command: firewall-cmd --permanent --add-port=6443/tcp
      when:
        - k3s_manage_firewalld | default(true) | bool
        - inventory_hostname in groups['k3s_server']
        - firewalld_state.stdout | default('') == 'running'

    # Reload the firewalld rules once the ports have been opened
    - name: Reload firewalld after opening k3s ports
      ansible.builtin.command: firewall-cmd --reload
      when:
        - k3s_manage_firewalld | default(true) | bool
        - firewalld_state.stdout | default('') == 'running'

- name: Install k3s server
  hosts: k3s_server
  become: true
  tasks:
    # Download, install, and start the k3s server process on the server node
    - name: Download and install k3s server
      ansible.builtin.shell: |
        curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="server --data-dir={{ k3s_data_dir }}" sh -
      args:
        creates: "{{ k3s_data_dir }}/server"

- name: Install k3s agent (workers)
  hosts: k3s_worker
  become: true
  serial: 1  # install one host at a time to reduce the network load of parallel downloads
  tasks:
    # Read the cluster token from the first server node
    # (run_once applies per serial batch, so with serial: 1 this runs once for each worker)
    - name: Read k3s token from first server
      ansible.builtin.slurp:
        src: "{{ k3s_data_dir }}/server/token"
      delegate_to: "{{ groups['k3s_server'][0] }}"
      run_once: true
      register: k3s_token_from_server

    # Store the decoded token on each worker for the install step that follows
    - name: Set fact for k3s token on workers
      ansible.builtin.set_fact:
        k3s_token: "{{ k3s_token_from_server.content | b64decode | trim }}"

    # Download, install, and start the k3s agent process on each worker node
    - name: Install k3s agent
      ansible.builtin.shell: |
        curl -sfL https://get.k3s.io | K3S_URL=https://{{ k3s_server_ip }}:6443 K3S_TOKEN={{ k3s_token }} INSTALL_K3S_EXEC="agent --data-dir={{ k3s_data_dir }}" sh -
      args:
        creates: "{{ k3s_data_dir }}/agent"
      async: 600
      poll: 15

- name: Configure firewalld baseline for k3s (flannel.1 / cni0 -> trusted)
  hosts: k3s_nodes
  become: true
  tasks:
    # firewalld baseline for k3s: add flannel.1 / cni0 to the trusted zone
    - block:
        # Check whether firewalld is available on the node
        - name: Check if firewalld is available
          ansible.builtin.command: firewall-cmd --state
          register: firewalld_check
          changed_when: false
          failed_when: false

        # Wait for the CNI interfaces flannel.1 and cni0 to appear (k3s has started and created them)
        - name: Wait for CNI interfaces (flannel.1, cni0) to appear
          ansible.builtin.shell: |
            for i in $(seq 1 120); do
              ip link show flannel.1 >/dev/null 2>&1 && ip link show cni0 >/dev/null 2>&1 && exit 0
              sleep 1
            done
            exit 1
          when: firewalld_check.stdout == 'running'

        # Add the flannel.1 / cni0 interfaces to the firewalld trusted zone (runtime and permanent)
        - name: Add flannel.1 and cni0 to firewalld trusted zone (runtime + permanent)
          ansible.builtin.shell: |
            firewall-cmd --zone=trusted --add-interface={{ item }}
            firewall-cmd --permanent --zone=trusted --add-interface={{ item }}
          loop:
            - flannel.1
            - cni0
          when: firewalld_check.stdout == 'running'

        # Reload the firewalld configuration so the new interface rules take effect immediately
        - name: Reload firewalld
          ansible.builtin.command: firewall-cmd --reload
          when: firewalld_check.stdout == 'running'
      when: k3s_manage_firewalld | default(true) | bool

- name: Post-install verification - traefik / nodes / curl
  hosts: k3s_server
  become: true
  run_once: true
  vars:
    k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
  tasks:
    # After installation, label the control node with control-plane (02-05 matrix M1 cannot be scheduled without it);
    # node names match the short host names in the inventory (ylc61~ylc64)
    - name: Label control-plane nodes (k3s does not add it by default; M1 needs this label)
      ansible.builtin.shell: |
        KUBECONFIG={{ k3s_kubeconfig }} kubectl label node {{ item }} node-role.kubernetes.io/control-plane= --overwrite
      loop: "{{ groups['k3s_server'] | default([]) }}"

    # Optional: label worker nodes with the worker role (needed by 02-05 matrix M3)
    - name: Optional - label worker nodes with the worker role (needed by 02-05 matrix M3)
      ansible.builtin.shell: |
        KUBECONFIG={{ k3s_kubeconfig }} kubectl label node {{ item }} node-role.kubernetes.io/worker= --overwrite
      loop: "{{ groups['k3s_worker'] | default([]) }}"
      when: k3s_manage_role_labels | default(true) | bool

    # List the Traefik / svclb Pods in the kube-system namespace
    - name: kubectl get pods -n kube-system (traefik / svclb)
      ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get pods -n kube-system -o wide | grep -E 'NAME|traefik|svclb'
      register: verify_traefik
      changed_when: false

    # Print the Traefik-related Pods found in the previous step
    - name: ">>> Traefik-related Pods"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ verify_traefik.stdout_lines }}"

    # List the nodes currently in the cluster
    - name: kubectl get nodes
      ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get nodes
      register: verify_nodes
      changed_when: false

    # Print the node list to confirm node status and roles
    - name: ">>> kubectl get nodes"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ verify_nodes.stdout_lines }}"

    # Use curl to test connectivity to ports 80 and 443 on every node
    - name: curl test of ports 80/443 on every node
      ansible.builtin.shell: |
        for ip in {{ groups['k3s_nodes'] | map('extract', hostvars) | map(attribute='ansible_host') | join(' ') }}; do
          c80=$(curl -sk -o /dev/null -w "%{http_code}" --connect-timeout 2 http://$ip 2>/dev/null) || c80="fail"
          c443=$(curl -sk -o /dev/null -w "%{http_code}" --connect-timeout 2 https://$ip 2>/dev/null) || c443="fail"
          echo "$ip: 80=$c80 443=$c443"
        done
      register: verify_curl
      changed_when: false

    - name: ">>> curl results"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ verify_curl.stdout_lines }}"
166
ansible/playbooks/nginx-matrix-deploy.yml
Normal file
@@ -0,0 +1,166 @@
|
||||
---
|
||||
# Ansible 一键部署 nginx 矩阵(M1~M4)
|
||||
# 对应文档:docs/02-05-nginx-验证矩阵-一键部署.md(02-01~02-04 分篇已整合)
|
||||
#
|
||||
# 说明:复制 manifests → kubectl apply → 等待 Pod 就绪 → 验证 Pod 节点分布 → curl 16 目标
|
||||
# manifests:ansible/files/nginx-matrix/,M1 control-plane / M2 ylc61 / M3 worker / M4 ylc64,按实际修改 02/04 hostname
|
||||
#
|
||||
# 执行(在 ansible/ 目录下):
|
||||
# ansible-playbook -i inventory.ini playbooks/nginx-matrix-deploy.yml
|
||||
# 或在仓库根目录:
|
||||
# ansible-playbook -i ansible/inventory.ini ansible/playbooks/nginx-matrix-deploy.yml
|
||||
- name: Deploy nginx matrix (M1~M4)
|
||||
hosts: k3s_server
|
||||
become: true
|
||||
run_once: true
|
||||
vars:
|
||||
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
|
||||
# manifests 在 ansible/files/nginx-matrix/,与 playbook 同项目
|
||||
manifests_path: "{{ playbook_dir }}/../files/nginx-matrix"
|
||||
tasks:
|
||||
- name: Ensure manifests path exists
|
||||
ansible.builtin.stat:
|
||||
path: "{{ manifests_path }}"
|
||||
register: manifests_stat
|
||||
|
||||
- name: Fail if manifests not found
|
||||
ansible.builtin.fail:
|
||||
msg: "manifests 未找到: {{ manifests_path }},请从仓库根目录或 ansible 同级执行"
|
||||
when: not manifests_stat.stat.exists
|
||||
|
||||
# 部署前确保 control-plane/worker 标签存在(M1/M3 需此才能调度),节点名为短主机名(ylc61~ylc64)
|
||||
- name: Ensure control-plane label on k3s_server nodes (for M1)
|
||||
ansible.builtin.shell: |
|
||||
KUBECONFIG={{ k3s_kubeconfig }} kubectl label node {{ item }} node-role.kubernetes.io/control-plane= --overwrite
|
||||
loop: "{{ groups['k3s_server'] | default([]) }}"
|
||||
|
||||
- name: Ensure worker label on k3s_worker nodes (for M3)
|
||||
ansible.builtin.shell: |
|
||||
KUBECONFIG={{ k3s_kubeconfig }} kubectl label node {{ item }} node-role.kubernetes.io/worker= --overwrite
|
||||
loop: "{{ groups['k3s_worker'] | default([]) }}"
|
||||
|
||||
- name: Copy nginx matrix manifests to server
|
||||
ansible.builtin.copy:
|
||||
src: "{{ manifests_path }}/"
|
||||
dest: /tmp/nginx-matrix/
|
||||
mode: '0644'
|
||||
|
||||
    # Delete all nginx matrix Deployments before applying, so that a stale ReplicaSet cannot leave any Mx serving the default page
    - name: Delete all nginx matrix deployments before apply
      ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl delete deployment nginx-m1 nginx-m2 nginx-m3 nginx-m4 -n default --ignore-not-found=true
      register: del_nginx
      changed_when: "'deleted' in del_nginx.stdout"

    - name: kubectl apply nginx matrix
      ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl apply -f /tmp/nginx-matrix/ -R
      register: k8s_apply
      changed_when: "'configured' in k8s_apply.stdout or 'created' in k8s_apply.stdout"

    - name: Restart nginx deployments so pods pick up ConfigMap (M1~M4 markers)
      ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl rollout restart deployment nginx-m1 nginx-m2 nginx-m3 nginx-m4 -n default
      register: restart_out
      changed_when: true

    - name: Wait for nginx pods to be ready
      ansible.builtin.shell: |
        KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
          -l app=nginx-m1 --timeout=60s
        KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
          -l app=nginx-m2 --timeout=60s
        KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
          -l app=nginx-m3 --timeout=120s
        KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
          -l app=nginx-m4 --timeout=120s
      register: wait_result
      changed_when: false

    - name: Verify nginx matrix
      ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get pod,svc,ing,ingressroute -n default -o wide
      register: verify
      changed_when: false

    - name: ">>> nginx matrix resources"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ verify.stdout_lines }}"

    - name: Verify Pod placement (M1/M2 should be on control-plane nodes, M3/M4 on worker nodes)
      ansible.builtin.shell: |
        KUBECONFIG={{ k3s_kubeconfig }} kubectl get pod -n default -o custom-columns='NAME:.metadata.name,APP:.metadata.labels.app,NODE:.spec.nodeName' | grep -E '^(NAME|nginx-m)'
      register: pod_placement
      changed_when: false

    - name: ">>> Pod placement"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ pod_placement.stdout_lines }}"

    - name: In-container diagnostics for M1 (why it may still serve the nginx welcome page)
      ansible.builtin.shell: |
        echo "========== 1. /usr/share/nginx/html/ inside the M1 container =========="
        KUBECONFIG={{ k3s_kubeconfig }} kubectl exec -n default deployment/nginx-m1 -- ls -la /usr/share/nginx/html/ 2>/dev/null || echo "(exec failed)"
        echo ""
        echo "========== 2. index.html inside the M1 container (first 5 lines) =========="
        KUBECONFIG={{ k3s_kubeconfig }} kubectl exec -n default deployment/nginx-m1 -- cat /usr/share/nginx/html/index.html 2>/dev/null | head -5 || echo "(exec failed)"
        echo ""
        echo "========== 3. /etc/nginx/conf.d/ inside the M1 container =========="
        KUBECONFIG={{ k3s_kubeconfig }} kubectl exec -n default deployment/nginx-m1 -- ls -la /etc/nginx/conf.d/ 2>/dev/null || echo "(exec failed)"
        echo ""
        echo "========== 4. default.conf inside the M1 container =========="
        KUBECONFIG={{ k3s_kubeconfig }} kubectl exec -n default deployment/nginx-m1 -- cat /etc/nginx/conf.d/default.conf 2>/dev/null || echo "(exec failed)"
        echo ""
        echo "========== 5. server blocks in the effective nginx config of M1 (first 40 lines) =========="
        KUBECONFIG={{ k3s_kubeconfig }} kubectl exec -n default deployment/nginx-m1 -- nginx -T 2>/dev/null | grep -A 200 "server {" | head -40 || echo "(exec failed)"
      register: m1_diag
      changed_when: false
      failed_when: false

    - name: ">>> M1 in-container diagnostics (if M1 still serves the welcome page, troubleshoot from this output)"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ m1_diag.stdout_lines }}"

    - name: Verify M1~M4 markers (index.html inside the Pod contains Mx; X-Backend response header)
      ansible.builtin.shell: |
        base="{{ groups['k3s_nodes'] | map('extract', hostvars) | map(attribute='ansible_host') | first }}"
        for id in 1 2 3 4; do
          echo "=== M$id: first 2 lines of index.html inside the Pod ==="
          KUBECONFIG={{ k3s_kubeconfig }} kubectl exec -n default deployment/nginx-m$id -- cat /usr/share/nginx/html/index.html 2>/dev/null | head -2 || echo "(exec failed)"
          echo "=== M$id: X-Backend response header ==="
          curl -sI "http://$base/demo-m$id/" 2>/dev/null | grep -i x-backend || echo "(X-Backend not seen)"
          echo ""
        done
      register: m_check
      changed_when: false
      failed_when: false

    - name: ">>> M1~M4 verification"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ m_check.stdout_lines }}"

    - name: curl check (16 targets, 4 nodes × 4 paths)
      ansible.builtin.shell: |
        bases="{{ groups['k3s_nodes'] | map('extract', hostvars) | map(attribute='ansible_host') | join(' ') }}"
        paths="/demo-m1 /demo-m2 /demo-m3 /demo-m4"
        count=0
        ok=0
        echo "=== 16 targets (4 nodes × 4 paths) ==="
        echo "Node         M1(ctrl+Ingress) M2(ctrl+IR) M3(work+Ingress) M4(work+IR)"
        for base in $bases; do
          m1=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 2 http://$base/demo-m1 2>/dev/null) || m1="fail"
          m2=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 2 http://$base/demo-m2 2>/dev/null) || m2="fail"
          m3=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 2 http://$base/demo-m3 2>/dev/null) || m3="fail"
          m4=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 2 http://$base/demo-m4 2>/dev/null) || m4="fail"
          printf "%-12s %-16s %-11s %-16s %s\n" "$base" "$m1" "$m2" "$m3" "$m4"
          for c in $m1 $m2 $m3 $m4; do count=$((count+1)); [ "$c" = "200" ] && ok=$((ok+1)); done
        done
        echo "---"
        echo "Checked $count targets, $ok returned 200"
      register: curl_result
      changed_when: false

    - name: ">>> curl matrix"
      ansible.builtin.debug:
        msg: "{{ item }}"
      loop: "{{ curl_result.stdout_lines }}"
188
ansible/playbooks/nginx-matrix-tls-deploy.yml
Normal file
@@ -0,0 +1,188 @@
|
||||
---
# One-step Ansible deployment of the nginx matrix, TLS edition (M1~M4, HTTPS)
# Matching document: docs/03-02-k3s-traefik-acme.md
#
# Description: copy the TLS + HTTP-only manifests → automatically delete any existing non-TLS nginx matrix (02-05) → kubectl apply (8 routes in total, TLS plus HTTP-only) → wait for Pods to be ready → verify with the HTTP-only / HTTPS curl matrices (test01~test04.jackadam.top)
# manifests: ansible/files/nginx-matrix-tls/; domains are test01~test04.jackadam.top; adjust the M2/M4 hostname to match your nodes; in the Ingress/IngressRoute objects, TLS routes bind only to websecure and HTTP-only routes bind only to web
# Prerequisites: ACME configured as described in 03-02 (Secret + traefik-acme.yaml), and test01~test04.jackadam.top resolving to the entry IP
#
# Run (from the ansible/ directory):
#   ansible-playbook -i inventory.ini playbooks/nginx-matrix-tls-deploy.yml
# Or from the repository root:
#   ansible-playbook -i ansible/inventory.ini ansible/playbooks/nginx-matrix-tls-deploy.yml
# Verification sends HTTPS requests to every host in k3s_nodes (all nodes are entry points, same as the 02-05 HTTP matrix)
- name: Deploy or cleanup nginx matrix TLS (M1~M4, HTTPS)
  hosts: k3s_server
  become: true
  run_once: true
  vars:
    # mode is passed with -e mode=cleanup and defaults to deploy when omitted
    # (do not write mode: "{{ mode | default('deploy') }}" under vars; it would recurse)
    k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
    manifests_path: "{{ playbook_dir }}/../files/nginx-matrix-tls"
    tls_domains:
      - test01.jackadam.top
      - test02.jackadam.top
      - test03.jackadam.top
      - test04.jackadam.top
  tasks:
    - name: Deploy nginx matrix TLS (mode=deploy)
      when: (mode | default('deploy')) == 'deploy'
      block:
        - name: Ensure manifests path exists
          ansible.builtin.stat:
            path: "{{ manifests_path }}"
          register: manifests_stat
          # The manifests live on the machine running Ansible, so check there rather than on the k3s server
          delegate_to: localhost
          become: false

        - name: Fail if manifests not found
          ansible.builtin.fail:
            msg: "manifests not found: {{ manifests_path }}; run from the repository root or from the ansible/ directory"
          when: not manifests_stat.stat.exists

        # Make sure the control-plane/worker labels exist before deploying (M1/M3 need them to be scheduled); node names are short hostnames (ylc61~ylc64)
        - name: Ensure control-plane label on k3s_server nodes (for M1)
          ansible.builtin.shell: |
            KUBECONFIG={{ k3s_kubeconfig }} kubectl label node {{ item }} node-role.kubernetes.io/control-plane= --overwrite
          loop: "{{ groups['k3s_server'] | default([]) }}"

        - name: Ensure worker label on k3s_worker nodes (for M3)
          ansible.builtin.shell: |
            KUBECONFIG={{ k3s_kubeconfig }} kubectl label node {{ item }} node-role.kubernetes.io/worker= --overwrite
          loop: "{{ groups['k3s_worker'] | default([]) }}"

        - name: Copy nginx matrix TLS manifests to server
          ansible.builtin.copy:
            src: "{{ manifests_path }}/"
            dest: /tmp/nginx-matrix-tls/
            mode: '0644'

        # If a non-TLS nginx matrix (02-05) exists, delete it first to avoid conflicts with, or leftovers next to, the TLS Ingress objects
        - name: Delete non-TLS nginx matrix if present (deployments, ingress, ingressroute, middleware, configmaps)
          ansible.builtin.shell: |
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete deployment,svc -n default nginx-m1 nginx-m2 nginx-m3 nginx-m4 --ignore-not-found=true
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete ingress -n default nginx-m1 nginx-m3 --ignore-not-found=true
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete ingressroute -n default nginx-m2 nginx-m4 --ignore-not-found=true
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete middleware -n default stripprefix-m1 stripprefix-m2 stripprefix-m3 stripprefix-m4 --ignore-not-found=true
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete configmap -n default nginx-m1-html nginx-m2-html nginx-m3-html nginx-m4-html --ignore-not-found=true
          register: del_non_tls
          changed_when: "'deleted' in del_non_tls.stdout"

        - name: kubectl apply nginx matrix TLS + HTTP-only
          ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl apply -f /tmp/nginx-matrix-tls/ -R
          register: k8s_apply
          changed_when: "'configured' in k8s_apply.stdout or 'created' in k8s_apply.stdout"

        - name: Restart nginx deployments so pods pick up ConfigMap (M1~M4 markers)
          ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl rollout restart deployment nginx-m1 nginx-m2 nginx-m3 nginx-m4 -n default
          register: restart_out
          changed_when: true

        - name: Wait for nginx pods to be ready
          ansible.builtin.shell: |
            KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
              -l app=nginx-m1 --timeout=60s
            KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
              -l app=nginx-m2 --timeout=60s
            KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
              -l app=nginx-m3 --timeout=120s
            KUBECONFIG={{ k3s_kubeconfig }} kubectl wait --for=condition=ready pod \
              -l app=nginx-m4 --timeout=120s
          register: wait_result
          changed_when: false

        - name: Verify nginx matrix TLS resources
          ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get pod,svc,ing,ingressroute -n default -o wide
          register: verify
          changed_when: false

        - name: ">>> nginx matrix TLS resources"
          ansible.builtin.debug:
            msg: "{{ item }}"
          loop: "{{ verify.stdout_lines }}"

        - name: Verify M1~M4 markers (index.html inside the Pod contains Mx; X-Backend response header, using the first entry node)
          ansible.builtin.shell: |
            first_ip="{{ groups['k3s_nodes'] | map('extract', hostvars) | map(attribute='ansible_host') | first }}"
            for id in 1 2 3 4; do
              echo "=== M$id: first 2 lines of index.html inside the Pod ==="
              KUBECONFIG={{ k3s_kubeconfig }} kubectl exec -n default deployment/nginx-m$id -- cat /usr/share/nginx/html/index.html 2>/dev/null | head -2 || echo "(exec failed)"
              echo "=== M$id: X-Backend response header (entry $first_ip) ==="
              curl -sI "https://test0$id.jackadam.top/" --resolve "test0$id.jackadam.top:443:$first_ip" -k 2>/dev/null | grep -i x-backend || echo "(X-Backend not seen)"
              echo ""
            done
          register: m_check
          changed_when: false
          failed_when: false

        - name: ">>> M1~M4 verification"
          ansible.builtin.debug:
            msg: "{{ item }}"
          loop: "{{ m_check.stdout_lines }}"

        - name: HTTP curl check (HTTP-only, 16 targets, all nodes × 4 domains)
          ansible.builtin.shell: |
            bases="{{ groups['k3s_nodes'] | map('extract', hostvars) | map(attribute='ansible_host') | join(' ') }}"
            count=0
            ok=0
            echo "=== 16 targets (4 nodes × 4 domains) HTTP ==="
            echo "Node         M1(test01)     M2(test02)     M3(test03)     M4(test04)"
            for base in $bases; do
              m1=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 http://test01.jackadam.top/ --resolve "test01.jackadam.top:80:$base" 2>/dev/null) || m1="fail"
              m2=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 http://test02.jackadam.top/ --resolve "test02.jackadam.top:80:$base" 2>/dev/null) || m2="fail"
              m3=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 http://test03.jackadam.top/ --resolve "test03.jackadam.top:80:$base" 2>/dev/null) || m3="fail"
              m4=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 http://test04.jackadam.top/ --resolve "test04.jackadam.top:80:$base" 2>/dev/null) || m4="fail"
              printf "%-12s %-14s %-14s %-14s %s\n" "$base" "$m1" "$m2" "$m3" "$m4"
              for c in $m1 $m2 $m3 $m4; do count=$((count+1)); [ "$c" = "200" ] && ok=$((ok+1)); done
            done
            echo "---"
            echo "Checked $count targets, $ok returned 200"
          register: curl_http_result
          changed_when: false
          failed_when: false

        - name: ">>> HTTP curl matrix (HTTP-only)"
          ansible.builtin.debug:
            msg: "{{ item }}"
          loop: "{{ curl_http_result.stdout_lines }}"

        - name: HTTPS curl check (16 targets, all nodes × 4 domains; every node is an entry point)
          ansible.builtin.shell: |
            bases="{{ groups['k3s_nodes'] | map('extract', hostvars) | map(attribute='ansible_host') | join(' ') }}"
            count=0
            ok=0
            echo "=== 16 targets (4 nodes × 4 domains) HTTPS ==="
            echo "Node         M1(test01)     M2(test02)     M3(test03)     M4(test04)"
            for base in $bases; do
              m1=$(curl -sk -o /dev/null -w "%{http_code}" --connect-timeout 5 https://test01.jackadam.top/ --resolve "test01.jackadam.top:443:$base" 2>/dev/null) || m1="fail"
              m2=$(curl -sk -o /dev/null -w "%{http_code}" --connect-timeout 5 https://test02.jackadam.top/ --resolve "test02.jackadam.top:443:$base" 2>/dev/null) || m2="fail"
              m3=$(curl -sk -o /dev/null -w "%{http_code}" --connect-timeout 5 https://test03.jackadam.top/ --resolve "test03.jackadam.top:443:$base" 2>/dev/null) || m3="fail"
              m4=$(curl -sk -o /dev/null -w "%{http_code}" --connect-timeout 5 https://test04.jackadam.top/ --resolve "test04.jackadam.top:443:$base" 2>/dev/null) || m4="fail"
              printf "%-12s %-14s %-14s %-14s %s\n" "$base" "$m1" "$m2" "$m3" "$m4"
              for c in $m1 $m2 $m3 $m4; do count=$((count+1)); [ "$c" = "200" ] && ok=$((ok+1)); done
            done
            echo "---"
            echo "Checked $count targets, $ok returned 200"
          register: curl_result
          changed_when: false
          failed_when: false

        - name: ">>> HTTPS curl matrix"
          ansible.builtin.debug:
            msg: "{{ item }}"
          loop: "{{ curl_result.stdout_lines }}"

    - name: Cleanup nginx matrix TLS (mode=cleanup)
      when: (mode | default('deploy')) == 'cleanup'
      block:
        - name: Delete nginx matrix TLS + HTTP-only resources (deployments, ingress, ingressroute, configmaps)
          ansible.builtin.shell: |
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete deployment,svc -n default nginx-m1 nginx-m2 nginx-m3 nginx-m4 --ignore-not-found=true
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete ingress -n default nginx-m1 nginx-m3 nginx-m1-http nginx-m3-http --ignore-not-found=true
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete ingressroute -n default nginx-m2 nginx-m4 nginx-m2-http nginx-m4-http --ignore-not-found=true
            KUBECONFIG={{ k3s_kubeconfig }} kubectl delete configmap -n default nginx-m1-html nginx-m2-html nginx-m3-html nginx-m4-html --ignore-not-found=true
          register: del_tls
          changed_when: "'deleted' in del_tls.stdout"

        - name: Remove copied nginx matrix TLS manifests directory
          ansible.builtin.file:
            path: /tmp/nginx-matrix-tls
            state: absent
47
ansible/playbooks/nodejs-demo-apply.yml
Normal file
@@ -0,0 +1,47 @@
|
||||
---
# Apply a Node.js demo manifest in one step (aligned with docs/04-01~04-13 and ansible/files/nodejs-demo)
#
# Run (from the repository root):
#   ansible-playbook -i ansible/inventory.ini ansible/playbooks/nodejs-demo-apply.yml \
#     -e nodejs_demo_manifest=04-01-nodejs-demo.yaml
#
# Default manifest: 04-01-nodejs-demo.yaml
- name: Apply nodejs-demo Kubernetes manifests
  hosts: k3s_server
  become: true
  run_once: true
  vars:
    k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
    nodejs_demo_manifest: "04-01-nodejs-demo.yaml"
    manifests_dir: "{{ playbook_dir }}/../files/nodejs-demo"
  tasks:
    - name: Ensure manifest file exists
      ansible.builtin.stat:
        path: "{{ manifests_dir }}/{{ nodejs_demo_manifest }}"
      register: nodejs_manifest_stat
      delegate_to: localhost
      become: false

    - name: Fail if manifest not found
      ansible.builtin.fail:
        msg: "{{ manifests_dir }}/{{ nodejs_demo_manifest }} not found; check the file name from the repository root"
      when: not nodejs_manifest_stat.stat.exists
      delegate_to: localhost
      become: false

    - name: Copy manifest to control plane
      ansible.builtin.copy:
        src: "{{ manifests_dir }}/{{ nodejs_demo_manifest }}"
        dest: "/tmp/{{ nodejs_demo_manifest }}"
        mode: "0644"

    - name: kubectl apply nodejs-demo manifest
      ansible.builtin.shell: |
        set -e
        KUBECONFIG={{ k3s_kubeconfig }} kubectl apply -f /tmp/{{ nodejs_demo_manifest }}
      register: nodejs_apply
      changed_when: "'configured' in nodejs_apply.stdout or 'created' in nodejs_apply.stdout"

    - name: Show kubectl apply output
      ansible.builtin.debug:
        var: nodejs_apply.stdout_lines
127
docs/00-00-构建总览.md
Normal file
@@ -0,0 +1,127 @@
|
||||
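# 00-00 Build Overview

> Main documentation entry point for this repository. Start reading here.

## Directory conventions

- Docs: `docs/` (reusable manifests, Kubernetes and otherwise, live in `ansible/files/` and are shared with the Ansible playbooks)
- Scripts: `scripts/`
- Script entry point: `scripts/README.md`

### Numbering quick reference

- `00-**`: overview and basic concepts (entry point, index, verification matrix, deployment environment, future plans)
- `01-**`: installation and base environment (control node / worker nodes / OpenWrt HAProxy / Cloudflare / NFS, etc.)
- `02-**`: nginx matrix, **one page per scenario** (M1~M4 each on its own page; **the combined one-step deployment is in `02-05`**)
- `03-**`: cluster-side configuration extensions; **03-04~03-10 are numbered in recommended reading order** (Traefik custom ports → Tunnel → local-path → NFS → Longhorn → HA → GitOps)
- `04-**`: advanced Node.js deployment (`04-01` is the main entry, `04-02`~`04-14` cover individual deployment topics; there is no required correspondence with the nginx matrix numbering)
- `05-**`: common application deployments (Homer, OneNav, GitLab, monitoring, openlist, etc.)
- `06-**`: troubleshooting and operations summaries (NetworkPolicy troubleshooting, operations notes)
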
---

## Recommended installation order

1. `00-01-k3s-基础概念.md`
2. `01-01-k3s-控制节点含traefik.md` (or use `01-07-节点初始化-ansible-实践.md` for one-step automation)
3. `01-02-k3s-工作节点.md`
4. `01-03-armv7-standalone-docker.md`
5. `01-04-cloudflare-tunnel.md`
6. `01-08-openwrt-haproxy.md` (optional: gateway load balancing)
7. `04-03-k3s-nginx-demo.md`
8. `04-01-k3s-nodejs-高级部署.md`
9. `04-02-nodejs-镜像与运行命令.md`
10. `04-03-nodejs-环境变量与配置注入.md`
11. `04-04-nodejs-端口与Service.md`
12. `04-05-nodejs-资源请求与限制.md`
13. `04-06-nodejs-探针与健康检查.md`
14. `04-07-nodejs-调度与亲和.md`
15. `04-08-nodejs-安全上下文.md`
16. `04-09-nodejs-存储与卷.md`
17. `04-10-nodejs-Ingress与Traefik.md`
18. `04-11-nodejs-副本与滚动发布.md`
19. `04-12-nodejs-TLS与证书.md`
20. `04-13-nodejs-HPA.md`
21. `04-14-nodejs-GitOps与CI流水线.md`
22. `02-05-nginx-验证矩阵-一键部署.md` (read `02-00-nginx-系列说明.md` first)
23. `03-01-k3s-traefik-dashboard.md`
24. `03-02-k3s-traefik-acme.md`
25. `03-03-k3s-traefik-dashboard-acme.md` (recommended order: 03-01 and 03-02 first)
26. `03-04-k3s-cloudflare-tunnel-配置接入.md` (optional: connect Cloudflare Tunnel to the cluster)
27. `03-05-k3s-local-path-pvc.md` (the local-path provisioner bundled with K3s; single-replica local persistence)
28. `03-06-k3s-使用nfs存储.md` (optional: PV/PVC when an NFS server already exists)
29. `03-07-k3s-longhorn-持久化存储.md` (stateful workloads, snapshots/backups; plan this before deploying GitLab and similar applications)
30. `03-08-k3s-ha-集群配置与切换.md` (optional: dual control-node HA, together with `01-05`)
31. `03-09-k3s-gitops-集群配置管理.md` (draft framework: Argo CD / Flux)

> To check whether these steps have been verified in a real environment, see `00-02-验证矩阵.md`.
> The verification environment for this repository is described in `00-04-部署环境说明.md`.

---

## Main path navigation

- `01-02-k3s-工作节点.md`
- `03-01-k3s-traefik-dashboard.md`
- `04-03-k3s-nginx-demo.md`
- `04-01-k3s-nodejs-高级部署.md` (the end of the page links to the `04-02`~`04-14` Node.js deployment topics)
- `03-02-k3s-traefik-acme.md`
- `03-04-k3s-cloudflare-tunnel-配置接入.md`
- `03-05-k3s-local-path-pvc.md`
- `03-06-k3s-使用nfs存储.md`
- `03-07-k3s-longhorn-持久化存储.md`
- `06-01-k3s-networkpolicy-故障排查.md`

---

## Matrix navigation

Matrix documents (M1~M8): **02-01~02-04 describe M1~M4 one page each**, and **02-05 is the combined one-step deployment**. For an actual deployment, use 02-05 or Ansible.

- `02-00-nginx-系列说明.md`
- `02-01-nginx-control-ingress.md` (M1)
- `02-02-nginx-control-ingressroute.md` (M2)
- `02-03-nginx-worker-ingress.md` (M3)
- `02-04-nginx-worker-ingressroute.md` (M4)
- `02-05-nginx-验证矩阵-一键部署.md` (combined HTTP-only deployment)

> **Note**: the standalone M1~M4 matrix documents once planned as a "Node.js versus nginx comparison" **have not been written in this repository**. **`04-05`~`04-08` are already used** for Node.js **deployment topics** (resources/probes/scheduling/security). If a Node.js matrix is added later, **give it new numbers** (for example starting at `04-20`, or file it under a special topic) so that it does not clash with the existing `04-**` documents.

---

## Topic navigation

- `00-04-部署环境说明.md` (node layout, IPs, OS, K3s version, etc., for comparison and reproduction)
- `01-07-节点初始化-ansible-实践.md` (one-step Ansible installation of the k3s cluster; verified)
- `01-08-openwrt-haproxy.md` (optional: gateway load balancing)
- nginx matrix: `ansible/playbooks/nginx-matrix-deploy.yml` (02-05), `ansible/playbooks/nginx-matrix-tls-deploy.yml` (03-02)
- `01-04-cloudflare-tunnel.md` (installation preparation)
- `03-04-k3s-cloudflare-tunnel-配置接入.md` (cluster integration)

- `05-03-k3s-安装gitlab-含runner.md`
- `05-04-k3s-配置gitlab-cicd.md`
- `05-01-k3s-部署homer首页面板.md`
- `05-02-onenav首页面板.md`
- `05-05-prometheus与grafana.md`
- `05-07-openclaw应用部署.md`
- `05-08-openclaw-k3s-实验部署.md`
- `01-06-armv7-nfs服务安装.md`
- `05-06-openlist挂载网盘与自动备份.md`
- `06-02-运维小结.md`
- `01-05-双控制节点ha.md`
- `03-08-k3s-ha-集群配置与切换.md`
- `03-09-k3s-gitops-集群配置管理.md` (draft framework)

---

## Troubleshooting entry points

- Docs: `06-01-k3s-networkpolicy-故障排查.md`
- Scripts: `scripts/README.md`

---

## Future plans

- `00-03-未来规划与待补功能.md`: a list and roadmap of capabilities that have been thought of but not built yet, so they can be filled in later as needed.

139
docs/00-01-k3s-基础概念.md
Normal file
@@ -0,0 +1,139 @@
|
||||
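# 00-01 K3s Basic Concepts

> Beginner's quick reference: understand the core concepts first, then move on to installation and troubleshooting.

## Reading advice

- Beginners can simply read this page from top to bottom
- When a term is unclear, come back here before continuing with the how-to documents

## 1. What K3s is

- A lightweight Kubernetes distribution, well suited to a homelab.
- **Purpose**: run and orchestrate containers on your own machines; this repository uses it to build a home lab environment. In your environment, `ylc61` is the server and `ylc62` is a worker.
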
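## 2. Service types (the ones you use most)

### 2.1 ClusterIP

- Reachable only from inside the cluster; no port is exposed externally.
- **Purpose**: service-to-service access (for example, Traefik forwards to backend Pods through a ClusterIP); suitable for internal calls.

### 2.2 NodePort

- Every node opens a high port (such as 3xxxx), and clients outside the cluster connect to "node IP:port".
- **Purpose**: temporary external access to a service, or troubleshooting. A long-lived public entry point normally uses LoadBalancer + Ingress instead of exposing a NodePort directly.
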
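As a rough illustration of the two types above, the manifests below expose the same Pods twice, once as a ClusterIP Service and once as a NodePort Service. The names `demo` and `demo-nodeport`, the `app: demo` selector and port 30080 are assumptions made for this sketch; they are not taken from this repository.

```yaml
# Sketch only: names, labels and ports are hypothetical.
apiVersion: v1
kind: Service
metadata:
  name: demo
  namespace: default
spec:
  type: ClusterIP          # the default type; reachable only inside the cluster
  selector:
    app: demo              # Pods carrying this label become the backends
  ports:
    - port: 80             # port exposed on the Service's cluster IP
      targetPort: 80       # port the container listens on
---
apiVersion: v1
kind: Service
metadata:
  name: demo-nodeport
  namespace: default
spec:
  type: NodePort
  selector:
    app: demo
  ports:
    - port: 80
      targetPort: 80
      nodePort: 30080      # must fall inside the NodePort range (30000-32767 by default)
```

With the second Service applied, a request to `http://<node-ip>:30080/` from outside the cluster should reach the same Pods that the first Service serves internally.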
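### 2.3 LoadBalancer (K3s ServiceLB)

When you create a Service of type LoadBalancer, K3s assigns it an external IP (usually a node IP) so that clients outside the cluster can reach the service through that IP; it acts as a **load-balancing entry point**. K3s implements this with its built-in ServiceLB: it starts `svclb-*` Pods on some nodes, and those nodes listen and forward to the backends. The node label **`svccontroller.k3s.cattle.io/enablelb=true`** controls which nodes take part in load balancing (that is, which nodes run svclb and expose the port); **`lbpool`** allows further grouping (for example `lbpool=edge`) when you need several groups of entry nodes. Nodes without `enablelb` do not carry entry traffic for the Service.

**Purpose**: let external clients reach your services through a single entry point (or the same port on a few nodes). Traefik's ports 80/443 are exposed this way.
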
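A minimal sketch of the pool mechanism described above, assuming the full label key is `svccontroller.k3s.cattle.io/lbpool` and that the Service carries the same label value as the nodes it should use. The Service name `demo-lb`, the `app: demo` selector and port 8080 are hypothetical.

```yaml
# Sketch only: the Service name, selector and port are hypothetical.
# Assumed prerequisite on each entry node (for example ylc62):
#   kubectl label node ylc62 svccontroller.k3s.cattle.io/enablelb=true svccontroller.k3s.cattle.io/lbpool=edge
apiVersion: v1
kind: Service
metadata:
  name: demo-lb
  namespace: default
  labels:
    svccontroller.k3s.cattle.io/lbpool: edge   # pick the node pool labelled lbpool=edge
spec:
  type: LoadBalancer
  selector:
    app: demo
  ports:
    - port: 8080           # chosen to avoid clashing with Traefik on 80/443
      targetPort: 80
```

After applying it, `kubectl get svc demo-lb` should list the selected nodes' IPs under EXTERNAL-IP, and `kubectl get pods -A -o wide | grep svclb` shows where the `svclb-*` Pods landed.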
## 3. Node labels (commonly used with nodeSelector)

### 3.1 hostname (pin to a single node)

**Label**: `kubernetes.io/hostname`; the value is the node name (a short hostname such as `ylc61`, which works well with Cloudflare CDN).

**How to get it**: the NAME column is the node name; the second command below prints only the node names:

```bash
kubectl get nodes
# or print only the node names
kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
```

**Purpose**: writing `kubernetes.io/hostname: <node name>` in a Deployment's `nodeSelector` makes its Pods run only on that one node (as M2 and M4 do in 02-05).

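To make the hostname selector concrete, here is a minimal Deployment sketch pinned to `ylc61`. The Deployment name `demo-pinned`, its labels and the `nginx:stable` image are assumptions for illustration only; the M2/M4 manifests in `ansible/files/nginx-matrix/` remain the real reference.

```yaml
# Sketch only: name, labels and image are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-pinned
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: demo-pinned
  template:
    metadata:
      labels:
        app: demo-pinned
    spec:
      nodeSelector:
        kubernetes.io/hostname: ylc61   # must equal the NAME shown by `kubectl get nodes`
      containers:
        - name: nginx
          image: nginx:stable
          ports:
            - containerPort: 80
```

If the hostname does not match any node, the Pod stays `Pending`; `kubectl describe pod` then reports that no node matched the node selector.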
### 3.2 control-plane / worker labels (any control node, any worker node)

**Labels**: `node-role.kubernetes.io/control-plane` (control plane) and `node-role.kubernetes.io/worker` (worker node). K3s often does not set both of these labels by default.

**How to get it**: check whether the control-plane label exists and what its value is. `<none>` in the CONTROL column means the node does not have the label. For worker nodes, use `--show-labels` or change `control-plane` in the path to `worker`.

```bash
kubectl get nodes -o custom-columns=NAME:.metadata.name,CONTROL:.metadata.labels.node-role\.kubernetes\.io/control-plane
```

**Purpose**: a nodeSelector with `node-role.kubernetes.io/control-plane: ""` (or `"true"`, whichever value the label actually has) lets a Pod land on any one control node (02-05 M1); `node-role.kubernetes.io/worker: ""` lets it land on any one worker node (02-05 M3). If the label is missing, add it first or use hostname instead.

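The fragment below sketches the worker variant of such a selector. It assumes the label was set with an empty value, as the playbooks in this repository do with `kubectl label node <name> node-role.kubernetes.io/worker= --overwrite`; only the Pod template part of a Deployment is shown.

```yaml
# Fragment of a Deployment; only the scheduling-related fields are shown.
spec:
  template:
    spec:
      nodeSelector:
        # The value must match the label exactly: "" here because the label was
        # set with an empty value. Use "true" instead if the label on your nodes
        # holds the string "true".
        node-role.kubernetes.io/worker: ""
```

Because several nodes can carry the same role label, the scheduler is free to pick any of them, which is what "any worker node" means for M3.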
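### 3.3 ROLES column (tell control nodes and worker nodes apart at a glance)

**Label**: this refers to the ROLES column of `kubectl get nodes`. The column is not a separate field on the node: kubectl derives it from the node's `node-role.kubernetes.io/<role>` labels and shows `control-plane` or, when no such label exists, `<none>`.

**How to get it**: run the command below and look at the ROLES column:

```bash
kubectl get nodes
```

**Purpose**: quickly see which machine is the server and which are workers. Note that ROLES only shows the role name, not the label value, and a nodeSelector must match the value as well. If a selector does not match, check the actual value with the command in 3.2, then either relabel the node or use hostname instead.
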
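## 4. Ingress and IngressRoute

- **Ingress**: the standard Kubernetes resource; routes requests to different backend Services by host name and path.
- **IngressRoute**: Traefik's own CRD; it can express more rules (such as middlewares and weighted backends).
- **Purpose**: on the same 80/443 ports, split traffic between applications by "which domain, which path", so that each service does not need its own port or IP.
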
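The two resources can be compared side by side with a pair of minimal routes to the same backend. This is a sketch under several assumptions: a Service named `demo` on port 80 exists, the ingress class is called `traefik`, the HTTP entry point is called `web`, and the cluster runs a Traefik version that serves the `traefik.io/v1alpha1` API group (older releases use `traefik.containo.us/v1alpha1`).

```yaml
# Sketch only: resource names, paths and the backend Service are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
  namespace: default
spec:
  ingressClassName: traefik
  rules:
    - http:
        paths:
          - path: /demo
            pathType: Prefix
            backend:
              service:
                name: demo
                port:
                  number: 80
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: demo-ingressroute
  namespace: default
spec:
  entryPoints:
    - web                            # HTTP entry point
  routes:
    - match: PathPrefix(`/demo-ir`)  # Traefik rule syntax instead of path/pathType
      kind: Rule
      services:
        - name: demo
          port: 80
```

Both routes forward the full request path to the backend; stripping the prefix needs an extra Traefik middleware, which is one of the things IngressRoute can reference directly.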
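## 5. NetworkPolicy basics

- Once a Pod is selected by a policy, traffic in that direction is denied by default and only what you explicitly allow gets through.
- For Traefik to reach backends, allow both the backend port in the target namespace and the Service CIDR (for example `10.43.0.0/16`).
- **Purpose**: restrict "who may talk to whom, on which ports" for network isolation and a smaller attack surface. A wrong policy breaks connectivity, so include policies in your troubleshooting.

Reference: `06-01-k3s-networkpolicy-故障排查.md`
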
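The first principle above can be sketched with an ingress policy on a backend: once the policy selects the Pods, only the listed source may connect. The `app: demo` label and port 80 are hypothetical, and the sketch assumes Traefik runs in `kube-system` with the Pod label `app.kubernetes.io/name: traefik`; check the real labels with `kubectl get pods -n kube-system --show-labels` before relying on it.

```yaml
# Sketch only: the backend label, port and Traefik Pod label are assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-traefik-to-demo
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: demo                      # Pods selected here become default-deny for ingress
  policyTypes:
    - Ingress
  ingress:
    - from:
        # namespaceSelector and podSelector in the same list item are ANDed:
        # only Traefik Pods inside kube-system are allowed.
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              app.kubernetes.io/name: traefik
      ports:
        - protocol: TCP
          port: 80
```

Every other source, including Pods in the same namespace, is rejected on ingress to the selected Pods, which is why a missing rule typically shows up as a timeout or a 502 at the entry point.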
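## 6. Common misconceptions

1. A `404` does not mean a failure (usually the entry point is reachable and no route matched)
2. A failed direct connection to a Pod IP does not necessarily affect the main path
3. Opening ports without configuring a forwarding policy often causes cross-node problems

**Purpose**: avoid detours when troubleshooting by first telling "could not connect" apart from "connected, but the route or policy is wrong".
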
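## 7. kubectl

- **What it is**: the command-line tool that talks to the cluster API; every `kubectl get/apply/delete/logs` in this repository uses it.
- **Purpose**: check node and Pod status, deploy or delete applications, read logs, troubleshoot. Day-to-day cluster management relies on it.
- **Where to run it**: by default, on the control node (server). Installing the K3s server generates a kubeconfig on that node (such as `/etc/rancher/k3s/k3s.yaml`), so only the control node can reach the API out of the box. A worker node without a copy of the kubeconfig cannot run `kubectl` locally. To run it on a worker or any other machine, copy the kubeconfig from the control node, set `KUBECONFIG`, and make sure the machine can reach port 6443 on the control node.
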
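When the kubeconfig is copied to another machine, one field usually needs editing: the file generated by K3s points at `https://127.0.0.1:6443`, which only works on the server itself. The fragment below is a sketch of the relevant part; `<control-node-ip>` is a placeholder to replace with the address of your server, and all other fields stay exactly as generated.

```yaml
# Fragment of a copied k3s.yaml; only the edited field is shown in full.
apiVersion: v1
kind: Config
clusters:
  - name: default
    cluster:
      # certificate-authority-data: keep the generated value unchanged
      server: https://<control-node-ip>:6443   # was https://127.0.0.1:6443 on the server
```

After saving the file, point `KUBECONFIG` at it and run `kubectl get nodes` to confirm that the machine can reach the API.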
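## 8. Storage and node failure

### 8.1 K3s default persistent volumes (local-path-provisioner)

K3s ships with **local-path-provisioner**: when you create a PVC without specifying `storageClassName`, it creates a local PersistentVolume on demand.

- **How it works**: after the PVC is created, the provisioner creates a directory on the local disk of **the node the Pod is scheduled to** (by default `storage` under `data-dir`, for example `/var/lib/rancher/k3s/storage` or `/storage`), then creates a PV for it and binds it to the PVC.
- **Bound to the node**: the data exists only in that node's local directory and is **tied to that node**. If the Pod is scheduled to another node it gets a new, empty volume; data on the old node is not migrated automatically.
- **When to use it**: single-replica applications, caches, logs and similar workloads that can tolerate losing data, or restoring it manually, after a Pod moves. **Data shared between replicas** should use shared storage such as NFS or CSI (see `01-06`).
- **How to inspect**: `kubectl get storageclass` lists `local-path` (usually the default); `kubectl get pv,pvc` lists the volumes that have been created.
- **Worked example**: see `03-05-k3s-local-path-pvc.md`.
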
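A minimal claim against this provisioner might look like the sketch below. The name `demo-data` and the 1Gi size are hypothetical; `storageClassName` is written out for clarity even though `local-path` is normally the default class.

```yaml
# Sketch only: the claim name and size are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: demo-data
  namespace: default
spec:
  accessModes:
    - ReadWriteOnce          # a node-local directory can only be mounted on one node
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
```

On a stock K3s install the claim is expected to stay `Pending` until a Pod that mounts it is scheduled, because the directory is only created on the node that Pod lands on.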
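### 8.2 hostPath and NFS (optional reading)

- **Pods can move, but local data on the host does not move with them**: when `hostPath` mounts a host directory into a container, the data lives only on that machine. Once the Pod is scheduled to another node, that machine has neither the directory nor the data, and the application "loses its data".
- **K3s does not move local data for you**: the scheduler only decides which node a Pod runs on; it does not synchronize `/var/lib/...` or your own directories. "Automatic failover when a node dies" and "highly available data" are two separate things and must be designed separately.
- **Common approaches**: put important data on shared storage (NFS / cloud disks / CSI) and hand it to Pods through PV/PVC (see `01-06` and `03-07`); keep caches and temporary files in local directories (`emptyDir` or `hostPath`) and accept that they are lost when the node fails; or regularly back up or sync the local directory elsewhere and restore it on the new node.

**Purpose**: know where the data lives and whether it is lost when a node fails, so you can design backups and high availability without surprises.
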
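The difference between the two local volume types can be seen in one Pod. This is an illustrative sketch: the Pod name, image, mount paths and the host directory `/srv/demo-data` are all hypothetical.

```yaml
# Sketch only: names, image and paths are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: demo-volumes
  namespace: default
spec:
  containers:
    - name: app
      image: nginx:stable
      volumeMounts:
        - name: cache
          mountPath: /cache
        - name: host-data
          mountPath: /data
  volumes:
    - name: cache
      emptyDir: {}                 # created with the Pod, deleted with the Pod
    - name: host-data
      hostPath:
        path: /srv/demo-data       # lives on whichever node runs the Pod
        type: DirectoryOrCreate    # create the directory if it does not exist yet
```

Files written to `/data` survive a Pod restart on the same node, but a replacement Pod on another node sees a different (initially empty) `/srv/demo-data`; files written to `/cache` disappear as soon as the Pod is deleted.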
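## 9. Deleting a deployment

- **What it is**: removing deployed resources (Deployment, Service, Ingress, etc.) from the cluster with `kubectl delete`.
- **How to use it**: delete with the same YAML you deployed with, mirroring `apply` one to one; or delete resources one by one by type and name.
- **Examples**:
  - `kubectl delete -f nginx-matrix.yaml`: delete every resource defined in that file
  - `kubectl delete -f ansible/files/nginx-matrix/ -R`: recursively delete the resources defined by all manifests in that directory (the 02-05 matrix)
  - `kubectl delete -f ansible/files/nginx-matrix-tls/ -R`: delete the 03-02 TLS matrix (or see that document, or run the playbook `nginx-matrix-tls-deploy.yml -e mode=cleanup`)
  - `kubectl delete deployment nginx-m1 -n default`: delete a single Deployment by name
- **Purpose**: clean up test applications, take services offline, or remove old resources before redeploying. After a resource is deleted its Pods are terminated and its records (stored in etcd) are removed as well; if a PVC was used, the PVC itself usually has to be deleted separately.

Reference: the deletion section of `02-05-nginx-验证矩阵-一键部署.md`.

## 10. Next steps

- `01-01-k3s-控制节点含traefik.md`
- `01-02-k3s-工作节点.md`
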
211
docs/00-02-验证矩阵.md
Normal file
@@ -0,0 +1,211 @@
|
||||
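# 00-02 Verification Matrix

> This page does exactly one thing: **it records in one place whether each key document has been verified in a real environment**.
>
> **Where the manifests are**: the deployable Kubernetes YAML in [`ansible/files/`](../ansible/files/) is the single source of truth (cross-referenced from `docs/`); verify against the files in that directory.
>
> Whoever writes the docs or runs the experiments treats this page as authoritative, so there is no need to dig through each document for its record.

## Status legend

- **❓ Not verified**: the structure and commands are written, but the document has **not yet** been run end to end in the target environment.
- **⚠️ Partially verified**: only some scenarios were verified (for example only on a single node, or HTTP but not HTTPS); the remarks state what was covered.
- **✅ Verified**: the document was followed from start to finish in the stated environment with the expected result; the remarks give the environment and date.

Suggested habits:

- Only after following a document all the way through on real machines, change its status from "not verified / partially verified" to "verified", and record the **OS / K3s version / date**.
- If a document's steps change substantially later, set its entry here back to "not verified" or "partially verified" until the new procedure has been run again.

---
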
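## 1. Main installation path (01-*)

- `00-01-k3s-基础概念.md`
  - Status: ✅ Verified
  - Remarks: conceptual document; no commands to run.
- `00-04-部署环境说明.md`
  - Status: ✅ Verified
  - Remarks: descriptive document covering this repository's verification environment (ylc61~64, Fedora, K3s v1.34.5+k3s1, /storage, etc.); consistent with the current deployment.
- `01-01-k3s-控制节点含traefik.md`
  - Status: ✅ Verified
  - Remarks: Fedora 43 Server + K3s v1.34.5+k3s1, single control node 61; installed by following the document and confirmed that the Traefik entry point answers with 404 (around 2026-03-10).
- `01-07-节点初始化-ansible-实践.md`
  - Status: ✅ Verified
  - Remarks: Fedora + K3s, 4 nodes (ylc61~64); Ansible completes initialization, server/agent installation, the firewalld baseline, Traefik labels and verification output in one step (around 2026-03).
- `01-02-k3s-工作节点.md`
  - Status: ✅ Verified
  - Remarks: worker node 62 joined successfully in the same environment, and `kubectl get nodes` showed both nodes Ready (around 2026-03-10).
- `01-03-armv7-standalone-docker.md`
  - Status: ❓ Not verified
  - Remarks: to be updated after installing Docker on a real armv7 device by following the document and running one or two containers.
- `01-04-cloudflare-tunnel.md`
  - Status: ⚠️ Partially verified
  - Remarks: the Cloudflare dashboard side (Tunnel/domain) has been used in practice; the full installation preparation still needs a run in a fresh environment.
- `01-08-openwrt-haproxy.md`
  - Status: ❓ Not verified
  - Remarks: OpenWrt gateway load balancing, forwarding 80/443 to the K3s nodes; the document was revised in 2026-03 (four health-check types TCP/HTTP/TLS/HTTPS, send-proxy-v2 example) and awaits verification on a real OpenWrt device.

---
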
## 2. Simple nginx deployments (02-*)

- `02-00-nginx-系列说明.md`
  - Status: ❓ Not verified (descriptive document)
  - Remarks: summarizes node scheduling and the differences between Ingress and IngressRoute; verification details to be added as needed.
- `02-01-nginx-control-ingress.md`
  - Status: ❓ Not verified
  - Remarks: to be verified by deploying nginx + Ingress on the control node as documented and checking with curl or a browser.
- `02-02-nginx-control-ingressroute.md`
  - Status: ❓ Not verified
  - Remarks: as above, verifying the basic routing path with IngressRoute.
- `02-03-nginx-worker-ingress.md`
  - Status: ❓ Not verified
  - Remarks: nginx Ingress to be verified on the worker-node traffic path.
- `02-04-nginx-worker-ingressroute.md`
  - Status: ❓ Not verified
  - Remarks: as above, IngressRoute variant.
- `02-05-nginx-验证矩阵-一键部署.md`
  - Status: ✅ Verified (the 4 combinations M1~M4, integrated)
  - Remarks: HTTP-only (for learning without a domain); with a domain, use the upgraded version in 03-02.

---

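## 3. Common k3s configuration

- `02-00-nginx-系列说明.md`
  - Status: ❓ Not verified (descriptive document)
  - Remarks: summarizes node scheduling and the differences between Ingress and IngressRoute (general troubleshooting approach for nodeSelector/labels/tolerations); verification details to be added as needed.
- `03-01-k3s-traefik-dashboard.md`
  - Status: ✅ Verified
  - Remarks: the Dashboard was enabled on every node of the 61/62/63/64 environment and confirmed reachable, with normal logs. Template: `ansible/files/traefik-dashboard/traefik-dashboard.yaml`.
- `03-02-k3s-traefik-acme.md`
  - Status: ✅ Verified
  - Remarks: upgraded version of 02-05 (TLS matrix + ACME); run successfully on real machines in 2026-03.
- `03-03-k3s-traefik-dashboard-acme.md`
  - Status: ⚠️ Partially verified
  - Remarks: the ACME configuration is aligned with 03-02 (which is verified on real machines); the combined Dashboard + ACME procedure still needs a real run. Template: `ansible/files/traefik-dashboard-acme/traefik-dashboard-acme.yaml`.
- `03-04-k3s-cloudflare-tunnel-配置接入.md`
  - Status: ⚠️ Partially verified
  - Remarks: the cloudflared deployment and Tunnel connection have worked in other projects; the full procedure for this lab cluster awaits verification on real machines.
- `03-05-k3s-local-path-pvc.md`
  - Status: ❓ Not verified
  - Remarks: local-path-provisioner bundled with K3s, local persistence through a PVC; awaiting verification on real machines.
- `03-06-k3s-使用nfs存储.md`
  - Status: ❓ Not verified
  - Remarks: PV/PVC + Pod mount to be verified on a real NFS server + K3s cluster.
- `03-07-k3s-longhorn-持久化存储.md`
  - Status: ❓ Not verified
  - Remarks: Longhorn installation and the PVC workflow await verification in this environment.
- `03-08-k3s-ha-集群配置与切换.md`
  - Status: ❓ Not verified
  - Remarks: the HA steps are written up, but the dual-server setup and failover drill have not been done in the current environment.
- `03-09-k3s-gitops-集群配置管理.md`
  - Status: ❓ Not verified
  - Remarks: draft framework; to be detailed once Argo CD or Flux is chosen.

### Optional: dependency documents

- `01-05-双控制节点ha.md`
  - Status: ❓ Not verified
  - Remarks: the document is split into installation and configuration procedures, but has not been verified end to end in the full dual control node + external LB scenario.
- `01-06-armv7-nfs服务安装.md`
  - Status: ❓ Not verified
  - Remarks: the NFS installation commands are backed by past experience; exports and permissions still need to be confirmed on this repository's armv7 environment.

---
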
## 4. 高级 Node.js(04-01~04-14)
|
||||
|
||||
- `04-01-k3s-nodejs-高级部署.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:主入口;具体能力验证依赖 `04-02`~`04-14` 分项。
|
||||
- `04-02-nodejs-镜像与运行命令.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:镜像 tag/`imagePullPolicy`/`command`/`args` 在实机拉取与启动验证。
|
||||
- `04-03-nodejs-环境变量与配置注入.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:ConfigMap/Secret 注入与 `printenv`/`curl` 结果一致。
|
||||
- `04-04-nodejs-端口与Service.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:`targetPort` 与进程监听一致;Endpoints 有地址。
|
||||
- `04-05-nodejs-资源请求与限制.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:`kubectl top` 与 OOM/节流行为符合预期。
|
||||
- `04-06-nodejs-探针与健康检查.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:readiness/liveness 与 Endpoint/重启行为验证。
|
||||
- `04-07-nodejs-调度与亲和.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:`nodeSelector`/亲和/容忍与节点标签实机一致。
|
||||
- `04-08-nodejs-安全上下文.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:非 root/只读根等策略下应用仍可运行。
|
||||
- `04-09-nodejs-存储与卷.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:PVC/emptyDir 挂载与读写、配合 `03-05`/`03-07` 存储选型。
|
||||
- `04-10-nodejs-Ingress与Traefik.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:path/host/入口点注解与 Traefik 路由一致。
|
||||
- `04-11-nodejs-副本与滚动发布.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:多副本与 `rollout`/`undo` 实机验证。
|
||||
- `04-12-nodejs-TLS与证书.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:HTTPS 与 `03-02` ACME/Secret 配合验证证书与域名。
|
||||
- `04-13-nodejs-HPA.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:metrics-server 可用;压测触发扩缩。
|
||||
- `04-14-nodejs-GitOps与CI流水线.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:流程文档;按 `05-03`/`05-04`/`03-09` 任选一条链路实机跑通后更新。
|
||||
|
||||
---
|
||||
|
||||
## 5. 常用应用与监控(05-*)
|
||||
|
||||
- `05-01-k3s-部署homer首页面板.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:待在集群内按文档部署 Homer,并确认首页可访问。
|
||||
- `05-02-onenav首页面板.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:包含 armv7 独立部署 + K3s 反向代理两个部分,需分别验证。
|
||||
- `05-03-k3s-安装gitlab-含runner.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:待完成 GitLab + Runner 安装与基础流水线运行。
|
||||
- `05-04-k3s-配置gitlab-cicd.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:需在真实仓库上跑通一次 K3s 部署流水线。
|
||||
- `05-05-prometheus与grafana.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:待完成 kube-prometheus-stack 安装与 Dashboard 访问。
|
||||
- `05-06-openlist挂载网盘与自动备份.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:待在实际网盘与备份目录上验证周期备份任务。
|
||||
- `05-07-openclaw应用部署.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:待在 x86 主机用 Docker 部署 OpenClaw,并在 K3s 中完成静态转发验证。
|
||||
- `05-08-openclaw-k3s-实验部署.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:待在 K3s 内按实验文档直接部署 OpenClaw Gateway,并确认入口可访问。
|
||||
|
||||
---
|
||||
|
||||
## 6. 排障与运维(06-*)
|
||||
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
- 状态:✅ 已验证
|
||||
- 备注:已在 Fedora 43 + K3s 环境排查并修复过“62:80 不通 / firewalld 拦截 flannel.1 <-> cni0”的问题,脚本与命令均来自实战过程。
|
||||
- `06-02-运维小结.md`
|
||||
- 状态:❓ 未验证
|
||||
- 备注:运维建议为经验总结,后续可在日常巡检/备份流程固化后逐条打勾。
|
||||
|
||||
---
|
||||
|
||||
## 7. How to update this matrix
|
||||
|
||||
- 修改某篇文档的关键步骤(尤其是“操作步骤 / 验证命令 / 预期”)时:
|
||||
- 记得同步更新这里对应条目的“状态”和“备注”。
|
||||
- 大改后建议先把状态退回“未验证”或“部分验证”,等新流程在实机跑完再改回“已验证”。
|
||||
- 执行中文文档一键安全对齐或大规模内容调整时,建议把 **验证矩阵** 一起纳入检查范围,避免出现“文档已经改了,但矩阵还显示已验证”的错觉。
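
To make the "matrix still says verified" problem easier to spot, a quick tally of the status lines can help. The following is a minimal sketch, not part of the repository: `matrix_summary` is a hypothetical helper name, and it assumes every entry keeps the `状态:` prefix followed by one of the three status texts used above.

```bash
#!/usr/bin/env bash
# Count matrix entries per status. Matches on the status text rather than
# the emoji so that emoji encoding variants do not affect the result.
matrix_summary() {
  local file="$1"
  local verified partial unverified
  # grep -c prints 0 and exits non-zero when nothing matches; "|| true" keeps the 0.
  verified=$(grep -c '状态:.*已验证' "$file" || true)
  partial=$(grep -c '状态:.*部分验证' "$file" || true)
  unverified=$(grep -c '状态:.*未验证' "$file" || true)
  printf 'verified=%s partial=%s unverified=%s\n' "$verified" "$partial" "$unverified"
}

# Usage from the repository root:
#   matrix_summary docs/00-02-验证矩阵.md
```

Comparing the tally before and after a large documentation change shows at a glance whether any entry was moved back to "unverified" or "partially verified".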
|
||||
|
||||
|
||||
111
docs/00-03-未来规划与待补功能.md
Normal file
@@ -0,0 +1,111 @@
|
||||
# 00-03-未来规划与待补功能
|
||||
|
||||
> 给未来的自己:这里不是“必须现在就做完”的清单,而是把你已经想到、但还没系统实现的能力先写下来,等有时间再一项项补。
|
||||
|
||||
## 1. 日志与审计体系
|
||||
|
||||
- **现状**
|
||||
- 主要依赖 `kubectl logs` + 节点本地日志 + Prometheus/Grafana 指标。
|
||||
- 没有集中日志查询入口,也没有明确的“关键操作审计”路径。
|
||||
- **规划方向**
|
||||
- 引入轻量日志聚合(例如 Loki 或 ELK 中的一个最小栈),统一收集:
|
||||
- K3s 控制面与核心组件日志;
|
||||
- 关键应用(GitLab、openlist、OpenClaw 等)的访问/错误日志。
|
||||
- 为“集群操作日志”(如 `kubectl apply/delete`)预留出口,后续可结合 GitOps 做审计。
|
||||
- **建议文档**
|
||||
- `05-09-k3s-集中日志与查询-loki.md`(示例名称)
|
||||
|
||||
## 2. 统一身份与权限管理(SSO)
|
||||
|
||||
- **现状**
|
||||
- GitLab、Grafana、Homer、openlist 等各自维护账号。
|
||||
- Cloudflare Zero Trust 只覆盖到部分 Web 入口,没有形成统一的“家庭账号体系”。
|
||||
- **规划方向**
|
||||
- 引入一个轻量 IdP(如 Keycloak / Authentik),集中管理家庭成员账号与 OAuth/OIDC 客户端。
|
||||
- 按优先级为关键组件接入 SSO:
|
||||
- GitLab、Grafana、Homer 优先;
|
||||
- 其余应用按需要接入。
|
||||
- **建议文档**
|
||||
- `05-10-homelab-sso-keycloak-部署与接入.md`
|
||||
|
||||
## 3. 运维自动化与 GitOps
|
||||
|
||||
- **现状**
|
||||
- 节点初始化、K3s 配置和应用部署以“手工 + scripts/”为主。
|
||||
- 没有一套“从裸机/虚机到完整环境”的幂等自动化流程。
|
||||
- **规划方向**
|
||||
- **节点侧**:✅ 已完成 `01-07-节点初始化-ansible-实践.md`,Ansible 一键完成初始化 + k3s 安装 + firewalld 基线 + Traefik 标签(含 8472/udp、6443/tcp 端口开放)。
|
||||
- **集群侧**:引入 GitOps(Argo CD / Flux 二选一)管理:
|
||||
- K3s 核心配置与 CRD;
|
||||
- Ingress/IngressRoute、Traefik 配置;
|
||||
- 常用应用(Homer、openlist、监控、GitLab 等)的清单。
|
||||
- **建议文档**
|
||||
- `03-09-k3s-gitops-集群配置管理.md`
|
||||
|
||||
## 4. 网络边界与安全基线
|
||||
|
||||
- **现状**
|
||||
- 已有 NetworkPolicy 排障文档 `06-01-k3s-networkpolicy-故障排查.md`。
|
||||
- 家庭网络与实验网段的边界、安全分区(IoT 设备、访客网络等)主要依赖网关/OpenWrt,尚未在本仓库中系统描述。
|
||||
- **规划方向**
|
||||
- 定义一份“最小可接受安全基线”:
|
||||
- 命名空间隔离与默认拒绝策略;
|
||||
- 仅对入口、监控、GitLab 等核心组件放行必须的东西;
|
||||
- 节点对外暴露端口白名单。
|
||||
- 梳理家庭网络拓扑与 K3s 网络在其中的位置:
|
||||
- 内/外网、IoT 网段、Admin 网段;
|
||||
- 哪些通过 Cloudflare、哪些只允许 VPN。
|
||||
- **建议文档**
|
||||
- `06-04-homelab-网络分区与安全基线.md`
|
||||
|
||||
## 5. 备份与灾难恢复(超越单应用)
|
||||
|
||||
- **现状**
|
||||
- `06-03-k3s-自动备份与恢复-openlist-webdav.md` 已覆盖 openlist 的备份/恢复实践。
|
||||
- 尚未有一份“集群级 + 存储级 + 应用级”的整体 DR 方案。
|
||||
- **规划方向**
|
||||
- 明确几类不同的“灾难级别”与对应恢复路径:
|
||||
1. 单个 Pod/Deployment 配置误操作;
|
||||
2. 某一节点(worker/server)硬件/系统损坏;
|
||||
3. 存储节点(NFS/硬盘阵列)损坏;
|
||||
4. 整个 K3s 集群需要在新环境中重建。
|
||||
- 对应规划:
|
||||
- K3s datastore/外部数据库定期备份;
|
||||
- NFS/重要 hostPath 目录的文件级备份或异地同步;
|
||||
- 关键应用(GitLab、openlist、openclaw workspace 等)的专项恢复演练。
|
||||
- **建议文档**
|
||||
- `06-05-k3s-集群级备份与灾难恢复设计.md`
|
||||
|
||||
## 6. 远程访问形态:Tunnel + VPN 双轨
|
||||
|
||||
- **现状**
|
||||
- 通过 Cloudflare Tunnel 提供部分 Web 入口访问。
|
||||
- 管理/运维时仍主要依赖局域网直接访问。
|
||||
- **规划方向**
|
||||
- 保持**Cloudflare Tunnel 作为零信任 Web 入口**方案;
|
||||
- 额外增加一条 **WireGuard/OpenVPN 运维 VPN** 路径:
|
||||
- 只向极少数管理设备开放;
|
||||
- 主要用途为 SSH、kubeconfig、底层网络排障。
|
||||
- **建议文档**
|
||||
- `01-08-wireguard-运维vpn-接入与实践.md`
|
||||
|
||||
## 7. 其他可选实验方向
|
||||
|
||||
> 这些不是“缺失”,而是你以后如果有时间,可以尝试的升级路线。
|
||||
|
||||
- **多集群/多环境管理**:
|
||||
- 在本地再起一个极简 K3s/Kind,用作“预生产/实验”环境,通过 GitOps 控制与主集群的差异。
|
||||
- **存储升级**:
|
||||
- 从基础 NFS 逐步尝试 Longhorn、Rook-Ceph 或轻量分布式存储,评估在家庭环境下的性价比与复杂度。
|
||||
- **可观测性增强**:
|
||||
- 在现有 Prometheus/Grafana 基础上补充 Alertmanager 与简单告警策略(如节点离线、磁盘空间、关键 Pod 异常)。
|
||||
|
||||
---
|
||||
|
||||
## 8. 使用方式建议
|
||||
|
||||
- 不必一次全部实现,可按“对你当前使用最有帮助的”优先级来选;
|
||||
- 每当某个方向完成初版实践时,在 `00-02-验证矩阵.md` 中补充状态与备注;
|
||||
- 新增文档时记得回到 `00-00-构建总览.md`,把入口挂上。
|
||||
|
||||
|
||||
55
docs/00-04-部署环境说明.md
Normal file
@@ -0,0 +1,55 @@
|
||||
# 00-04-部署环境说明
|
||||
|
||||
> 本文描述本仓库文档所针对的**验证环境**:节点布局、IP、OS、K3s 版本等。其他环境按需对照调整。
|
||||
|
||||
## 1. 节点与角色
|
||||
|
||||
| Hostname | IP | Role | Notes |
| ----- | ------------ | ---------- | -------------------------- |
| ylc61 | 192.168.2.61 | k3s server | Control node; runs the API server, the datastore (SQLite by default for a single server installed without `--cluster-init`), Traefik, etc. |
| ylc62 | 192.168.2.62 | k3s worker | Worker node |
| ylc63 | 192.168.2.63 | k3s worker | Worker node |
| ylc64 | 192.168.2.64 | k3s worker | Worker node |
|
||||
|
||||
|
||||
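- Node names in Kubernetes use the short hostname (e.g. `ylc61`~`ylc64`) and match the host names in the inventory. This also helps when working with Cloudflare CDN: if a machine's hostname were an FQDN, name resolution on that machine would prefer the local entry and access would fail.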
|
||||
- 控制机(运行 `ansible-playbook`)可任选一台,通常为 ylc61 或本机。
|
||||
|
||||
## 2. 软件版本(已验证)
|
||||
|
||||
|
||||
| Component | Version | Notes |
| ------- | ----------------- | --------------------------- |
| OS | Fedora 43 Server (CoreOS) | For other RHEL-family / Debian-family systems, adapt as described in each document |
| K3s | v1.34.5+k3s1 | Default from get.k3s.io |
| Ansible | ansible-core 2.18 | Used for the automated install in `01-07` |
|
||||
|
||||
|
||||
## 3. 网络与存储
|
||||
|
||||
- **网段**:192.168.2.0/24
|
||||
- **可选**:OpenWrt 网关(如 192.168.2.1)上配置 HAProxy 负载均衡,将 80/443 转发到 K3s 节点,见 `01-08-openwrt-haproxy.md`
|
||||
- **数据盘方案**:`/storage`,server 与 worker 均使用 `--data-dir=/storage`
|
||||
- **token 路径**:`/storage/server/token`
|
||||
|
||||
## 4. 防火墙
|
||||
|
||||
- **firewalld**:启用
|
||||
- **已放行端口**:
|
||||
- 6443/tcp(k3s API,仅 server)
|
||||
- 8472/udp(flannel VXLAN,全部节点)
|
||||
- flannel.1、cni0 加入 trusted zone
|
||||
|
||||
## 5. Ansible 相关
|
||||
|
||||
- **inventory**:`ansible/inventory.ini`,分组 `k3s_server`、`k3s_worker`、`k3s_nodes`
|
||||
- **变量**:`ansible/group_vars/all.yml`,含 `k3s_data_dir`、`k3s_server_ip`、`k3s_manage_`* 等
|
||||
- **playbook(k3s)**:`ansible/playbooks/k3s-init-and-install.yml`
|
||||
- **playbook(nginx 矩阵)**:`ansible/playbooks/nginx-matrix-deploy.yml`(manifests 在 `ansible/files/nginx-matrix/`,文档 `02-05`)
|
||||
- **playbook(nginx TLS 矩阵)**:`ansible/playbooks/nginx-matrix-tls-deploy.yml`(manifests 在 `ansible/files/nginx-matrix-tls/`,文档 `03-02`(02-05 升级版))
|
||||
- **SSH**:root 连接,`scripts/ssh/setup-k3s-workers-ssh.sh` 预配密钥
|
||||
|
||||
## 6. 验证时间
|
||||
|
||||
- 2026-03:4 节点集群按 `01-07` 一次性安装成功,各节点 Traefik 入口 404 可达。
|
||||
|
||||
156
docs/01-01-k3s-控制节点含traefik.md
Normal file
@@ -0,0 +1,156 @@
|
||||
# 01-01-k3s-控制节点含traefik
|
||||
|
||||
> 在控制节点安装 K3s Server,确认基础组件与 Traefik 可用。
|
||||
>
|
||||
> 若需一键自动化安装多节点集群,可直接用 `01-07-节点初始化-ansible-实践.md`。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 控制节点系统已完成基础网络配置
|
||||
- 可使用 `sudo`,并可访问公网或本地镜像源
|
||||
- 节点时间已同步(NTP)
|
||||
- **方案二(数据盘)**:若使用自定义存储目录,需先挂载数据盘并创建 `/storage`(如 10G 系统盘 + 128G 数据盘场景)
|
||||
|
||||
## 存储方案说明
|
||||
|
||||
K3s 默认将数据(含 local-path 卷)放在 `--data-dir` 下。系统盘较小时,可将数据目录放到数据盘(如 `/storage`),避免占满系统盘。
|
||||
|
||||
| Option | Data directory | When to use |
|------|----------|----------|
| **Option 1 (default)** | `/var/lib/rancher/k3s` | The system disk has enough space |
| **Option 2 (data disk)** | `/storage` | Small system disk, with a separate data disk mounted at `/storage` |
|
||||
|
||||
> 自定义 `/storage` 仅解决单节点内系统盘/数据盘分离;节点或数据盘重建后数据不会自动迁移,高可用与备份见 `01-05`、`06-03`。
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 在控制节点安装 K3s Server(默认包含 Traefik)
|
||||
2. 等待核心组件进入 Running
|
||||
3. 记录节点 IP,供后续工作节点加入和入口验证
|
||||
|
||||
### 方案一:默认数据目录
|
||||
|
||||
```bash
|
||||
curl -sfL https://get.k3s.io | sh -
|
||||
```
|
||||
|
||||
### 方案二:数据盘(自定义数据目录)
|
||||
|
||||
确保数据盘已挂载到 `/storage` 后执行:
|
||||
|
||||
```bash
|
||||
curl -sfL https://get.k3s.io | sh -s - server --data-dir=/storage
|
||||
```
|
||||
|
||||
- 使用方案二时,token 路径为 `/storage/server/token`(供 01-02 工作节点加入与 01-05 HA 使用)。
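
Because the token location depends on the data directory, it is easy to look in the wrong place when switching between the two options. A minimal sketch of the rule follows; `k3s_token_path` is a hypothetical helper name, and the two paths are the ones given in `01-01` and `01-02`.

```bash
#!/usr/bin/env bash
# Print the server token path for a given K3s data directory.
# With no argument, the default data directory is assumed.
k3s_token_path() {
  local data_dir="${1:-/var/lib/rancher/k3s}"
  # Strip one trailing slash so "/storage/" and "/storage" give the same result.
  printf '%s/server/token\n' "${data_dir%/}"
}

# Usage on the control node (reading the token requires root):
#   sudo cat "$(k3s_token_path /storage)"
```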
|
||||
|
||||
## 配置 kubectl(供当前用户使用)
|
||||
|
||||
安装后 K3s 生成的 kubeconfig 在 `/etc/rancher/k3s/k3s.yaml`,默认仅 root 可读。若希望**当前用户**在本机直接执行 `kubectl`(无需 sudo),可任选其一:
|
||||
|
||||
**方式一:复制到用户目录并设 KUBECONFIG**
|
||||
|
||||
```bash
|
||||
mkdir -p ~/.kube
|
||||
sudo cp /etc/rancher/k3s/k3s.yaml ~/.kube/config
|
||||
sudo chown $(id -u):$(id -g) ~/.kube/config
|
||||
chmod 600 ~/.kube/config
|
||||
export KUBECONFIG=~/.kube/config
|
||||
|
||||
# 若希望每次登录自动生效,可把下面这一行(整行,不要带 #)写入 ~/.bashrc 或 ~/.profile:
|
||||
# export KUBECONFIG=~/.kube/config
|
||||
# 写入示例: echo 'export KUBECONFIG=~/.kube/config' >> ~/.bashrc
|
||||
```
|
||||
|
||||
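**Method 2: point KUBECONFIG at the original path (root has to relax the file's permissions)**

Note: K3s typically resets this file to mode 600 when the service restarts unless the server was started with `--write-kubeconfig-mode 644`, so the `chmod` below may need to be repeated.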
|
||||
|
||||
```bash
|
||||
sudo chmod 644 /etc/rancher/k3s/k3s.yaml
|
||||
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml
|
||||
```
|
||||
|
||||
之后即可用 `kubectl get nodes` 等命令做验证。
|
||||
|
||||
## 其他设备使用 kubectl 控制集群
|
||||
|
||||
从**笔记本、跳板机等非控制节点**用 kubectl 管理集群时,需要:① 在该设备上**安装 kubectl**;② 有一份 kubeconfig,且其中的 API 地址指向控制节点(不能是 127.0.0.1)。
|
||||
|
||||
**0. 在其他设备上安装 kubectl**
|
||||
|
||||
在要执行 `kubectl` 的那台机器上安装 kubectl(仅需一次):
|
||||
|
||||
- **Linux(通用)**:从官方 release 下载与集群版本相近的二进制(建议与 K3s 自带的 Kubernetes 版本一致),放入 PATH:
|
||||
|
||||
```bash
# Example uses v1.34.5 to match the K3s v1.34.5+k3s1 cluster described in 00-04;
# set this to the Kubernetes version of your own cluster.
KUBECTL_VERSION=v1.34.5
curl -LO "https://dl.k8s.io/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl"
chmod +x kubectl
sudo mv kubectl /usr/local/bin/
kubectl version --client
```
|
||||
|
||||
- **Debian/Ubuntu**:可用包管理器安装(版本可能略旧):
|
||||
|
||||
```bash
# pkgs.k8s.io hosts one repository per minor release; replace v1.34 with your cluster's minor version
sudo apt-get update && sudo apt-get install -y apt-transport-https ca-certificates curl gnupg
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.34/deb/Release.key | sudo gpg --dearmor -o /usr/share/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.34/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update && sudo apt-get install -y kubectl
```
|
||||
|
||||
- **macOS**:`brew install kubectl`(或从 [Kubernetes 官方文档](https://kubernetes.io/zh-cn/docs/tasks/tools/install-kubectl/) 选择其他方式)。
|
||||
|
||||
- **Windows**:可用 `winget install Kubernetes.kubectl` 或从 [官方文档](https://kubernetes.io/zh-cn/docs/tasks/tools/install-kubectl-windows/) 下载二进制并加入 PATH。
|
||||
|
||||
**1. 在控制节点上准备可外连的 kubeconfig**
|
||||
|
||||
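Run the following on the control node, replacing `<control-node-address>` with the real IP or hostname (for example `192.168.2.61` or `ylc61`):

```bash
sudo sed 's/127\.0\.0\.1/<control-node-address>/' /etc/rancher/k3s/k3s.yaml > /tmp/k3s-for-remote.yaml
sudo chmod 644 /tmp/k3s-for-remote.yaml
# This file holds cluster-admin credentials; delete it from /tmp once it has been copied.
```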
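
The address rewrite can also be limited to the `server:` line so that nothing else in the file is touched. This is a minimal sketch under stated assumptions: `make_remote_kubeconfig` is a hypothetical helper name, the kubeconfig is assumed to contain a line of the form `server: https://127.0.0.1:<port>`, and the host value must not contain `#` or `&`.

```bash
#!/usr/bin/env bash
# Write a copy of a kubeconfig with only the API server host replaced.
make_remote_kubeconfig() {
  local src="$1" host="$2" dest="$3"
  (
    # Create the copy readable by its owner only, since it holds credentials.
    umask 077
    sed -E "s#^([[:space:]]*server:[[:space:]]*https://)127\.0\.0\.1(:[0-9]+)#\1${host}\2#" "$src" > "$dest"
  )
}

# Usage on the control node (run as root so the source file is readable):
#   make_remote_kubeconfig /etc/rancher/k3s/k3s.yaml 192.168.2.61 /root/k3s-for-remote.yaml
```

If the address used here is not already covered by the API server certificate, the client is likely to reject the connection; the `--tls-san` option shown in `01-05` is the usual way to add extra names.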
|
||||
|
||||
**2. 拷贝到其他设备**
|
||||
|
||||
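Run the following on the **other device** (it needs SSH access to the control node; otherwise transfer the file some other way, such as a USB drive):

```bash
# Create ~/.kube first in case it does not exist yet
mkdir -p ~/.kube
# Example: pull the file from the control node to this machine
scp <user>@<control-node-address>:/tmp/k3s-for-remote.yaml ~/.kube/config
# Restrict permissions, since the file contains credentials
chmod 600 ~/.kube/config
```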
|
||||
|
||||
**3. 本机使用**
|
||||
|
||||
```bash
|
||||
export KUBECONFIG=~/.kube/config
|
||||
# 可写入 ~/.bashrc / ~/.zshrc
|
||||
kubectl get nodes
|
||||
```
|
||||
|
||||
**注意**:其他设备需能访问控制节点的 **6443** 端口(K3s API)。若中间有防火墙,需放行控制节点 6443;若用域名,需能解析到控制节点 IP。
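
Before debugging kubeconfig contents, it can save time to confirm that port 6443 is reachable at all. A minimal sketch follows, assuming bash and the coreutils `timeout` command are available on the client; `can_reach` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Return 0 if a TCP connection to host:port succeeds within the timeout.
# Uses bash's /dev/tcp redirection, so no extra network tools are needed.
can_reach() {
  local host="$1" port="$2" secs="${3:-3}"
  # Inside the child shell, $0 is the host and $1 is the port.
  timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage on the remote device:
#   can_reach 192.168.2.61 6443 && echo "API port reachable" || echo "API port blocked or host down"
```

A failure here points at routing, DNS, or a firewall rather than at the kubeconfig.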
|
||||
|
||||
## 验证命令
|
||||
|
||||
若已按上节配置当前用户的 kubectl,可直接执行 `kubectl`;否则使用 `sudo kubectl`。
|
||||
|
||||
```bash
|
||||
kubectl get nodes -o wide
|
||||
kubectl -n kube-system get pods -o wide
|
||||
kubectl -n kube-system get deploy,svc traefik -o wide
|
||||
curl -I --max-time 3 http://127.0.0.1:80
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
- `kubectl get nodes` 显示控制节点为 `Ready`
|
||||
- `kube-system` 命名空间核心组件正常运行
|
||||
- Traefik 服务已创建并可响应(常见为 `404`,表示入口已通)
|
||||
|
||||
## 下一步
|
||||
|
||||
- 继续 `01-02-k3s-工作节点.md`
|
||||
147
docs/01-02-k3s-工作节点.md
Normal file
@@ -0,0 +1,147 @@
|
||||
# 01-02-k3s-工作节点与 Traefik 部署配置
|
||||
|
||||
> 本文已合并原 `01-02-k3s-工作节点.md`。
|
||||
> 目标:完成工作节点加入 + Traefik 入口部署基线,并验证「**入口节点集合**的 `:80` 可达」。
|
||||
>
|
||||
> 若需一键自动化安装多节点集群,可直接用 `01-07-节点初始化-ansible-实践.md`。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-01-k3s-控制节点含traefik.md`
|
||||
- 已拿到 token:默认方案为 `/var/lib/rancher/k3s/server/token`;若控制节点采用**数据盘方案**则为 `/storage/server/token`
|
||||
- 控制节点可执行 `kubectl`
|
||||
- **方案二(数据盘)**:若工作节点也使用数据盘,需先挂载数据盘并创建 `/storage`
|
||||
|
||||
## 工作节点加入集群(在工作节点执行)
|
||||
|
||||
与 01-01 存储方案一致:控制节点用默认则工作节点用方案一;控制节点用数据盘则建议工作节点也用方案二,便于统一路径。
|
||||
|
||||
### 方案一:默认数据目录
|
||||
|
||||
```bash
curl -sfL https://get.k3s.io | \
  K3S_URL=https://192.168.2.61:6443 \
  K3S_TOKEN=<TOKEN> \
  sh -
```
|
||||
|
||||
### 方案二:数据盘(自定义数据目录)
|
||||
|
||||
确保数据盘已挂载到 `/storage` 后执行:
|
||||
|
||||
```bash
curl -sfL https://get.k3s.io | sh -s - agent \
  --data-dir=/storage \
  --server https://192.168.2.61:6443 \
  --token <TOKEN>
```
|
||||
|
||||
将 `<TOKEN>` 替换为从控制节点读取的 token 内容;若控制节点使用数据盘,token 路径为 `/storage/server/token`。
|
||||
|
||||
## 防火墙基线(部署即做)
|
||||
|
||||
> **执行节点**:在**每台**运行 k3s 且使用 firewalld 的节点上执行(控制节点 + 工作节点)。
|
||||
> **是否必须**:使用 firewalld 的发行版(如 FCOS/Fedora)**必须**执行,否则跨节点流量可能被拦截;未使用 firewalld 的节点可跳过。
|
||||
> 防火墙与下方 Traefik 入口**两者都需要做**,不是二选一。防火墙部分:2.1 脚本 或 2.2 手动,**二选一**即可。
|
||||
|
||||
说明(FCOS/Fedora 重点):
|
||||
|
||||
- FCOS/Fedora 默认 firewalld 转发策略较严格。
|
||||
- K3s 负责集群网络组件(如 flannel、kube-proxy)与规则下发,但**不会自动完成你宿主机 firewalld 的 zone 接口归类策略**。
|
||||
- 入口 Pod(Traefik/svclb-traefik)可能调度到任意节点,且回包路径会经过该节点本地的 `flannel.1` / `cni0`,不一定绕控制节点。
|
||||
- 若某个节点上 `flannel.1 <-> cni0` 的转发被 firewalld 拦截,该节点上的入口流量就会在某些调度/流向下异常,即使其它节点正常。
|
||||
|
||||
### 2.1 脚本方式(推荐)
|
||||
|
||||
脚本实现:等待 `flannel.1`、`cni0` 出现后,将其加入 firewalld 的 `trusted` 区域,并**持久化**(`--permanent` + `--reload`),重启后仍生效。
|
||||
早期排障时,曾只在控制节点手工执行过少量临时放行命令即可恢复访问,那是因为当时入口 Pod 全在控制节点、所有回包都经由控制节点;
|
||||
但在“Traefik 可跑在任意节点、部分节点被选为入口节点”的设计下,每个启用 firewalld 的 k3s 节点都必须持久放行本机 `flannel.1/cni0`,否则一旦入口 Pod 或业务 Pod 调度到该节点,就可能在该节点上重现同类故障。
|
||||
|
||||
在**每台** k3s 节点上分别执行:
|
||||
|
||||
```bash
|
||||
./scripts/diag/firewalld/setup-k3s-firewalld-interfaces.sh
|
||||
```
|
||||
|
||||
### 2.2 手动方式(不使用脚本)
|
||||
|
||||
在**每台** k3s 节点上分别执行:
|
||||
|
||||
```bash
|
||||
sudo firewall-cmd --zone=trusted --add-interface=flannel.1
|
||||
sudo firewall-cmd --zone=trusted --add-interface=cni0
|
||||
sudo firewall-cmd --permanent --zone=trusted --add-interface=flannel.1
|
||||
sudo firewall-cmd --permanent --zone=trusted --add-interface=cni0
|
||||
sudo firewall-cmd --reload
|
||||
```
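
Since the baseline has to hold on every node, a small check that reports which interface is still missing from the `trusted` zone can be run after either method. This is a minimal sketch; `missing_trusted_ifaces` is a hypothetical helper name, and it assumes `firewall-cmd --list-interfaces` prints a space-separated list on one line.

```bash
#!/usr/bin/env bash
# Print the required interfaces that are absent from a space-separated list.
# Prints an empty line when both flannel.1 and cni0 are present.
missing_trusted_ifaces() {
  local listed=" $1 " iface missing=()
  for iface in flannel.1 cni0; do
    # Pad with spaces so that only whole names match.
    [[ "$listed" == *" $iface "* ]] || missing+=("$iface")
  done
  echo "${missing[*]}"
}

# Usage on a node:
#   missing_trusted_ifaces "$(sudo firewall-cmd --zone=trusted --list-interfaces)"
```

Running it once against the runtime configuration and once with `--permanent` added to the `firewall-cmd` call shows whether the setting would survive a reboot.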
|
||||
|
||||
## Traefik 入口节点范围(在控制节点执行,必须做)
|
||||
|
||||
为需要对外暴露 80/443 的节点打标签,使 Traefik 的 svclb 调度到这些**入口节点**。按你的节点名和 IP 替换 `ylc61`、`ylc62` 等:
|
||||
|
||||
### 3.1 手工方式(直接 kubectl 打标签)
|
||||
|
||||
```bash
|
||||
# 示例:选择 ylc61/ylc62 作为入口节点
|
||||
kubectl label node ylc61 svccontroller.k3s.cattle.io/enablelb=true --overwrite
|
||||
kubectl label node ylc62 svccontroller.k3s.cattle.io/enablelb=true --overwrite
|
||||
kubectl label node ylc61 svccontroller.k3s.cattle.io/lbpool=edge --overwrite
|
||||
kubectl label node ylc62 svccontroller.k3s.cattle.io/lbpool=edge --overwrite
|
||||
```
|
||||
|
||||
你可以根据需要选择 1 台、2 台、3 台或全部 4 台节点作为入口节点;被打上上述两个标签的节点将承载 Traefik/svclb 暴露的 80/443 入口。
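
When the set of ingress nodes changes often, generating the label commands from a node list avoids typos. The sketch below only prints the commands so they can be reviewed first; `ingress_label_cmds` is a hypothetical helper name, and the two labels are the ones used in the manual steps above.

```bash
#!/usr/bin/env bash
# Print (but do not run) the two label commands for each ingress node given.
ingress_label_cmds() {
  local node
  for node in "$@"; do
    echo "kubectl label node $node svccontroller.k3s.cattle.io/enablelb=true --overwrite"
    echo "kubectl label node $node svccontroller.k3s.cattle.io/lbpool=edge --overwrite"
  done
}

# Usage on the control node:
#   ingress_label_cmds ylc61 ylc62          # review the output
#   ingress_label_cmds ylc61 ylc62 | sh     # apply it
# Afterwards, show both labels as columns:
#   kubectl get nodes -L svccontroller.k3s.cattle.io/enablelb,svccontroller.k3s.cattle.io/lbpool
```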
|
||||
|
||||
### 3.2 Ansible 方式(推荐,集中管理入口节点)
|
||||
|
||||
也可以在 [`ansible/group_vars/all.yml`](../ansible/group_vars/all.yml) 中配置入口节点列表 `k3s_ingress_nodenames`(示例:`ylc61`、`ylc62`),由 `k3s-init-and-install.yml` 自动打标签。
|
||||
|
||||
运行:
|
||||
|
||||
```bash
|
||||
cd ansible
|
||||
ansible-playbook -i inventory.ini playbooks/k3s-init-and-install.yml
|
||||
```
|
||||
|
||||
若 `k3s_ingress_nodenames` 为空(默认),Ansible 会对**所有节点**打入口标签,与早期行为一致;
|
||||
填写后则仅对列出的节点打标签,实现「按需选择入口节点」。
|
||||
|
||||
**可选(02-05 / 03-02 矩阵 M1/M3 需要)**:若要用 `02-05-nginx-验证矩阵-一键部署.md` 或 `03-02-k3s-traefik-acme.md` 中 M1(随机一台控制节点)、M3(随机一台工作节点),需为节点打角色标签。在控制节点执行:
|
||||
|
||||
```bash
|
||||
# 控制节点打标(按实际控制节点名改,可多台)
|
||||
kubectl label node ylc61 node-role.kubernetes.io/control-plane= --overwrite
|
||||
# 工作节点打标(按实际工作节点名改,可多台)
|
||||
kubectl label node ylc62 node-role.kubernetes.io/worker= --overwrite
|
||||
kubectl label node ylc63 node-role.kubernetes.io/worker= --overwrite
|
||||
```
|
||||
|
||||
未打标时,M1/M3 会 Pending,可改用 M2/M4 的 hostname 指定节点或见 02-05 排障小节。
|
||||
|
||||
## Verification commands (run on the control node unless noted)

```bash
# Run this one on each worker node; on the control node the service is named k3s
sudo systemctl status k3s-agent --no-pager
kubectl get nodes -o wide
kubectl -n kube-system get svc traefik -o wide
# Recent K3s releases name these pods svclb-traefik-<hash>, so select them by service name
kubectl -n kube-system get pods -l svccontroller.k3s.cattle.io/svcname=traefik -o wide
# Run on every node that uses firewalld
sudo firewall-cmd --zone=trusted --list-interfaces

# HTTP check against the ingress node IPs (example: 192.168.2.61 / 192.168.2.62)
curl -I --max-time 3 http://192.168.2.61:80
curl -I --max-time 3 http://192.168.2.62:80
```
|
||||
|
||||
## 预期
|
||||
|
||||
- 工作节点显示 `Ready`
|
||||
- `trusted` 中可看到 `flannel.1 cni0`
|
||||
- 被标记为入口节点的 IP:80(示例中为 `192.168.2.61`、`192.168.2.62`)可返回 Traefik 响应(常见 `404`)
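
The expectation that a `404` means "the ingress path works" is easy to misread, so a small helper that turns status codes into verdicts may be useful when checking several nodes. This is a minimal sketch; the function names and verdict strings are made up for illustration.

```bash
#!/usr/bin/env bash
# Map the HTTP status seen on an ingress node to a short verdict.
classify_ingress_status() {
  case "$1" in
    000)         echo "unreachable" ;;          # curl reports 000 when no HTTP response arrived
    404)         echo "traefik-up-no-route" ;;  # Traefik answered; no router matched the request
    2??|3??)     echo "routed" ;;
    502|503|504) echo "gateway-error" ;;        # see 06-01 for cross-node troubleshooting
    *)           echo "other" ;;
  esac
}

# Probe port 80 on each IP given and print "<ip> <code> <verdict>".
check_ingress_nodes() {
  local ip code
  for ip in "$@"; do
    code=$(curl -s -o /dev/null --max-time 3 -w '%{http_code}' "http://$ip:80" || true)
    echo "$ip ${code:-000} $(classify_ingress_status "${code:-000}")"
  done
}

# Usage:
#   check_ingress_nodes 192.168.2.61 192.168.2.62
```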
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 若出现 `502/跨节点不通/admin-prohibited`,看:`06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-01-k3s-traefik-dashboard.md`
|
||||
- `04-03-k3s-nginx-demo.md`
|
||||
31
docs/01-03-armv7-standalone-docker.md
Normal file
@@ -0,0 +1,31 @@
|
||||
# 01-03-armv7-standalone-Docker
|
||||
|
||||
> armv7 节点不加入 K3s,单独运行 Docker 服务(NFS、OneNav、openlist 等)。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- armv7 节点网络可达
|
||||
- 系统可安装 Docker
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 安装 Docker
|
||||
2. 启用并启动 Docker 服务
|
||||
3. 按需部署业务容器
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
docker version
|
||||
docker ps
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
- Docker 可用
|
||||
- 容器可正常启动
|
||||
|
||||
## 下一步
|
||||
|
||||
- `05-02-onenav首页面板.md`
|
||||
- `01-06-armv7-nfs服务安装.md`
|
||||
44
docs/01-04-cloudflare-tunnel.md
Normal file
@@ -0,0 +1,44 @@
|
||||
# 01-04-Cloudflare Tunnel
|
||||
|
||||
> 本文只负责 Cloudflare Tunnel 的安装准备与云端侧创建。
|
||||
> K3s 侧 `cloudflared` 部署与验证见:`03-04-k3s-cloudflare-tunnel-配置接入.md`。
|
||||
|
||||
---
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 控制节点已就绪:`01-01-k3s-控制节点含traefik.md`
|
||||
- Traefik 已可用(单节点 K3s 也可使用 Cloudflare Tunnel)
|
||||
- 域名已托管在 Cloudflare
|
||||
- 已创建 Cloudflare Zero Trust 账号
|
||||
|
||||
---
|
||||
|
||||
## 云端创建 Tunnel
|
||||
|
||||
1. 在 Cloudflare Zero Trust 创建一个 Tunnel
|
||||
2. 记录 `Tunnel Token` 或凭据 JSON
|
||||
3. 规划域名映射,例如:
|
||||
- `git.example.com` -> `http://traefik.kube-system.svc.cluster.local`
|
||||
- `home.example.com` -> 同上
|
||||
|
||||
## 安装准备检查
|
||||
|
||||
- 确认已拿到 Tunnel Token 或凭据文件
|
||||
- 确认域名与子域映射规划完成
|
||||
- 确认目标入口指向 Traefik(后续在 K3s 中接入)
|
||||
|
||||
---
|
||||
|
||||
## 注意事项
|
||||
|
||||
- 没有 token/凭据:回到 Zero Trust 页面重新生成
|
||||
- 子域规划混乱:先固定一张映射表再做集群接入
|
||||
- 需要部署到 K3s:转到 `03-04-k3s-cloudflare-tunnel-配置接入.md`
|
||||
|
||||
---
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-04-k3s-cloudflare-tunnel-配置接入.md`
|
||||
|
||||
80
docs/01-05-双控制节点ha.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# 01-05-双控制节点HA(安装与准备)
|
||||
|
||||
> 本文只讲双控制节点 HA 的安装前准备与基础环境搭建。
|
||||
> 具体集群参数切换、server 加入与迁移步骤见 `03-08-k3s-ha-集群配置与切换.md`。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-01-k3s-控制节点含traefik.md`
|
||||
- 已完成 `01-02-k3s-工作节点.md`
|
||||
- 当前集群运行稳定,可执行维护窗口
|
||||
|
||||
## 目标与边界
|
||||
|
||||
- 目标:控制平面单点故障时仍可管理集群
|
||||
- 边界:家庭网关(如 OpenWrt)可能仍是整体单点
|
||||
|
||||
## 安装准备清单
|
||||
|
||||
1. 新增第二个 server 节点(示例 `192.168.2.63`)
|
||||
2. 准备外部数据存储(MySQL/PostgreSQL/etcd)
|
||||
3. 准备 `6443` 负载均衡(HAProxy)
|
||||
4. 备份现有 token 与关键配置
|
||||
|
||||
### 外部 datastore 与 k3s server 最小示例
|
||||
|
||||
以下只给出一个“最小可参考”的 PostgreSQL + k3s server 参数示意,具体地址/账号请按你自己的环境调整:
|
||||
|
||||
- **若采用 01-01 的数据盘方案**:在 server 参数中增加 `--data-dir=/storage`,与首节点一致(第二个 server 安装时同样需要)。
|
||||
|
||||
```bash
# Assumes the external PostgreSQL database and account already exist:
# host=192.168.2.50 dbname=k3s user=k3s password=strong-password

# On the first server (e.g. 192.168.2.61), with the default data directory:
sudo k3s server \
  --datastore-endpoint="postgres://k3s:strong-password@192.168.2.50:5432/k3s?sslmode=disable" \
  --tls-san 192.168.2.61 \
  --tls-san 192.168.2.62 \
  --tls-san 192.168.2.63 \
  --tls-san 192.168.2.60   # example LB IP

# With the data-disk option, add --data-dir=/storage, for example:
# sudo k3s server --data-dir=/storage \
#   --datastore-endpoint="postgres://..." --tls-san ...
```
|
||||
|
||||
> 说明:上面的命令仅作为参数示意,实际部署时建议改用 systemd unit 或官方安装脚本的额外参数(`INSTALL_K3S_EXEC=...`),并结合 `03-08-k3s-ha-集群配置与切换.md` 中的步骤执行。
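
One way the install-script variant could look is sketched below. The helper only composes the endpoint string in the same form as the example above; `pg_datastore_endpoint` is a hypothetical name, the credentials are the placeholder values from this document, and passwords containing special characters would need URL-encoding, which this sketch does not do.

```bash
#!/usr/bin/env bash
# Compose a PostgreSQL datastore endpoint: user, password, host, database, optional port.
pg_datastore_endpoint() {
  local user="$1" pass="$2" host="$3" db="$4" port="${5:-5432}"
  printf 'postgres://%s:%s@%s:%s/%s?sslmode=disable\n' "$user" "$pass" "$host" "$port" "$db"
}

# Possible use with the official install script on a server node
# (K3S_DATASTORE_ENDPOINT is the environment form of --datastore-endpoint):
#   curl -sfL https://get.k3s.io | \
#     INSTALL_K3S_EXEC="server --tls-san 192.168.2.60" \
#     K3S_DATASTORE_ENDPOINT="$(pg_datastore_endpoint k3s strong-password 192.168.2.50 k3s)" \
#     sh -
```

Both servers would need to be given the same endpoint and token; the exact order of operations is left to `03-08`.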
|
||||
|
||||
### 从现有 worker 升级为第二控制节点(推荐路径)
|
||||
|
||||
在家庭实验室环境中,第二个控制节点通常可以直接复用一台已有的 worker 节点。整体思路是:
|
||||
|
||||
1. **确认 worker 节点健康**:
|
||||
- 已按 `01-02-k3s-工作节点.md` 正常加入集群;
|
||||
- 无关键 Pod 仅运行在该节点(可先用 `kubectl drain` 或手动迁移工作负载)。
|
||||
2. **在 `01-05` 阶段完成外部 datastore 与 LB 准备**:
|
||||
- 不要立即改动现有 server/worker 的 systemd 配置,只确保 datastore/LB 均已就绪。
|
||||
3. **Follow the steps in `03-08` to replace that worker with a server**:
|
||||
- 停止该节点上的 `k3s-agent` 服务(或执行官方卸载脚本);
|
||||
- 使用与首个 server 相同的 token/datastore/LB 地址重新以 `server` 角色安装 k3s;
|
||||
- 最终形成“2 个 server + 若干 worker”的目标拓扑。
|
||||
|
||||
> 具体切换命令与顺序详见:`03-08-k3s-ha-集群配置与切换.md` 中的操作步骤。
|
||||
|
||||
## 基础验证
|
||||
|
||||
```bash
|
||||
kubectl get nodes -o wide
|
||||
kubectl get pods -A
|
||||
```
|
||||
|
||||
## 风险提示
|
||||
|
||||
- 这是高级改造,建议在业务稳定后执行
|
||||
- 执行前务必做完整备份
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-08-k3s-ha-集群配置与切换.md`
|
||||
|
||||
43
docs/01-06-armv7-nfs服务安装.md
Normal file
@@ -0,0 +1,43 @@
|
||||
# 01-06-armv7 NFS 服务安装
|
||||
|
||||
> 本文只讲 armv7 主机侧 NFS 服务安装与导出配置。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-03-armv7-standalone-docker.md`
|
||||
- armv7 与 K3s 节点网络互通
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 在 armv7 安装 NFS 服务(nfs-utils / nfs-kernel-server)
|
||||
2. 创建导出目录(例如 `/data/nfs`)
|
||||
3. 配置 `/etc/exports`
|
||||
4. 放行 NFS 端口并启用开机自启
|
||||
|
||||
示例(按发行版调整):
|
||||
|
||||
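```bash
sudo mkdir -p /data/nfs
# nobody:nogroup is the Debian/Ubuntu naming; RHEL-family systems use nobody:nobody
sudo chown -R nobody:nogroup /data/nfs
# tee overwrites /etc/exports; use `tee -a` if the file already has entries to keep
echo "/data/nfs 192.168.2.0/24(rw,sync,no_subtree_check,no_root_squash)" | sudo tee /etc/exports
sudo exportfs -rav
# The unit may be named nfs-kernel-server on Debian/Ubuntu
sudo systemctl enable --now nfs-server
```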
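
If `/etc/exports` may already contain other entries, adding the line idempotently is safer than overwriting the file. This is a minimal sketch; `ensure_export` is a hypothetical helper name, and it only compares whole lines, so an existing entry for the same path with different options would not be detected.

```bash
#!/usr/bin/env bash
# Append an export line only if that exact line is not already in the file.
ensure_export() {
  local exports_file="$1" line="$2"
  touch "$exports_file"
  # -x: whole-line match, -F: fixed string, so parentheses and slashes are literal.
  grep -qxF -- "$line" "$exports_file" || printf '%s\n' "$line" >> "$exports_file"
}

# Usage on the NFS host, as root:
#   ensure_export /etc/exports "/data/nfs 192.168.2.0/24(rw,sync,no_subtree_check,no_root_squash)"
#   exportfs -rav
```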
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
showmount -e localhost
|
||||
sudo exportfs -v
|
||||
sudo systemctl status nfs-server --no-pager
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
- `showmount -e` 可看到导出目录
|
||||
- NFS 服务为运行状态
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-06-k3s-使用nfs存储.md`
|
||||
|
||||
137
docs/01-07-节点初始化-ansible-实践.md
Normal file
@@ -0,0 +1,137 @@
|
||||
# 01-07-节点初始化与 k3s 自动安装(Ansible 实践)
|
||||
|
||||
> 目标:给一组已经装好 OS、可以 SSH 的裸金属/虚机,**一键完成基础初始化 + 安装 k3s server/worker**,得到与 `01-01`、`01-02` 文档一致的集群(含 `/storage` 数据盘方案)。
|
||||
>
|
||||
> **状态:已验证**(2026-03,Fedora + K3s,4 节点 61~64)。
|
||||
> 部署环境详见 `00-04-部署环境说明.md`。
|
||||
|
||||
## 1. 适用边界与前提
|
||||
|
||||
- 已完成:
|
||||
- **控制机**(运行 ansible-playbook 的那台机器)已安装 Ansible:
|
||||
- Fedora / RHEL:`sudo dnf install ansible`
|
||||
- Debian / Ubuntu:`sudo apt install ansible`
|
||||
- 验证:`ansible --version`、`ansible-playbook --version`
|
||||
- 每台目标机器已经安装好 Linux(如 Fedora/CentOS/Debian 等);
|
||||
- 能通过 SSH 无密码/密钥登录(Ansible 能连通);
|
||||
- 若使用 `scripts/ssh/setup-k3s-workers-ssh.sh` 为每节点配置独立密钥,需在 inventory 中为各 host 指定 `ansible_ssh_private_key_file`(见下文示例);
|
||||
- 使用 **root** SSH 连接(`group_vars/all.yml` 中 `ansible_user: root`),配合 `scripts/ssh/setup-k3s-workers-ssh.sh` 为**所有节点**(含控制节点)配置 jack + root 的公钥;
|
||||
- IP 规划、主机名已大致确定,例如:
|
||||
- `ylc61`:k3s server,IP `192.168.2.61`
|
||||
- `ylc62` ~ `ylc64`:k3s worker,IP `192.168.2.62` ~ `192.168.2.64`
|
||||
- **数据盘**:若使用 `/storage` 方案,需在每台节点上提前挂载数据盘并创建 `/storage`;
|
||||
- 不覆盖:
|
||||
- 从「完全裸铁 + 无系统」开始的 PXE 装机;
|
||||
- Advanced HA (multiple servers + external datastore): still follow `01-05` and `03-08` for that.
|
||||
|
||||
## 2. 目录结构
|
||||
|
||||
本仓库已有 `ansible/`:
|
||||
|
||||
```text
|
||||
ansible/
|
||||
ansible.cfg # host_key_checking=False 等
|
||||
inventory.ini
|
||||
group_vars/
|
||||
all.yml
|
||||
playbooks/
|
||||
k3s-init-and-install.yml # 标准 IPv4 安装
|
||||
```
|
||||
|
||||
## 3. 示例 inventory
|
||||
|
||||
`ansible/inventory.ini`:
|
||||
|
||||
```ini
|
||||
[k3s_server]
|
||||
ylc61 ansible_host=192.168.2.61 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.61
|
||||
|
||||
[k3s_worker]
|
||||
ylc62 ansible_host=192.168.2.62 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.62
|
||||
ylc63 ansible_host=192.168.2.63 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.63
|
||||
ylc64 ansible_host=192.168.2.64 ansible_ssh_private_key_file=~/.ssh/id_ed25519_k3s_192.168.2.64
|
||||
|
||||
[k3s_nodes:children]
|
||||
k3s_server
|
||||
k3s_worker
|
||||
```
|
||||
|
||||
> 提示:上面使用短主机名(如 `ylc61`~`ylc64`),应与各节点 hostname 及 `kubectl get nodes` 输出的 NAME 一致,便于配合 Cloudflare CDN;playbook 的 Init 阶段会为所有 k3s 节点写入 /etc/hosts 条目。
|
||||
|
||||
## 4. 全局变量
|
||||
|
||||
**唯一真源**:[`ansible/group_vars/all.yml`](../ansible/group_vars/all.yml)(含 `ansible_user`、`k3s_data_dir`、`k3s_server_ip`、`k3s_manage_*` 等)。
|
||||
|
||||
若需安装后自动打 **control-plane/worker 角色标签**(供 `02-05` 与 `03-02` 的 M1/M3 使用),在同一文件中增加,例如:
|
||||
|
||||
- `k3s_manage_role_labels: true`
|
||||
- `k3s_control_plane_nodenames: ["ylc61"]`
|
||||
- `k3s_worker_nodenames: ["ylc62", "ylc63", "ylc64"]`
|
||||
|
||||
节点名必须与 `kubectl get nodes` 输出一致(使用短主机名 ylc61~ylc64)。未配置时仅打 enablelb/lbpool,不打角色标签。
|
||||
|
||||
## 5. 执行流程概览
|
||||
|
||||
playbook 依次执行:
|
||||
|
||||
| Order | Stage | What it does |
|------|------|------|
| 1 | Init | Time zone, base packages, /etc/hosts, **firewalld opens 8472/udp (all nodes) and 6443/tcp (server only)** |
| 2 | Install server | Installs the k3s server (`--data-dir=/storage`) |
| 3 | Install agent | Installs workers one at a time (`serial: 1`; `async/poll` to avoid hangs) |
| 4 | Firewalld baseline | Waits for flannel.1/cni0 to appear (up to 120s), then adds them to the trusted zone |
| 5 | Traefik labels | Reads node names from the cluster and applies the enablelb/lbpool labels |
| 6 | Role labels (optional) | When `k3s_manage_role_labels: true`, labels control nodes as control-plane and worker nodes as worker |
| 7 | Verification | Prints `kubectl get nodes`, `kubectl get pods -n kube-system`, and an HTTP curl against each node |
|
||||
|
||||
**关键实现点**:
|
||||
|
||||
- **端口 8472/udp**:flannel VXLAN 所需,必须在 Init 阶段开放,否则 worker 上 flannel 无法建立 overlay,`flannel.1` / `cni0` 永远不会出现;
|
||||
- **Firewalld 基线(flannel.1/cni0 → trusted)**:FCOS/Fedora 默认 firewalld 转发策略较严格;K3s 不会自动配置宿主机 firewalld 的 zone 接口归类。入口 Pod(Traefik/svclb-traefik)可能调度到任意节点,回包路径会经过该节点本地的 `flannel.1`/`cni0`。若某节点上 `flannel.1 ↔ cni0` 的转发被 firewalld 拦截,该节点上的入口流量就会异常,即使其它节点正常。详见 `01-02-k3s-工作节点.md`;
|
||||
- **Traefik 标签**:使用 `kubectl get nodes -o jsonpath` 获取实际节点名,不依赖 inventory 主机名与 K8s 节点名一致;
|
||||
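- **Role labels (optional)**: by default the playbook applies only enablelb/lbpool and does **not** apply `node-role.kubernetes.io/control-plane` or `node-role.kubernetes.io/worker`. If M1/M3 of the nginx matrix in `02-05` / `03-02` need to be schedulable, enable `k3s_manage_role_labels` and set the control-node and worker-node name lists (see section 4), or label the nodes by hand on the control node after installation, following the optional step in 01-02.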
|
||||
- **Agent 安装**:token 通过 `slurp` 从 server 读取,`delegate_to` 到 server 执行。
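
The "wait for flannel.1/cni0, up to 120s" idea from the firewalld baseline stage can be sketched in plain shell as follows. This is not the playbook's actual task, only an illustration of the same polling logic; `wait_for_iface` is a hypothetical name, and it assumes a Linux host where interfaces appear under `/sys/class/net`.

```bash
#!/usr/bin/env bash
# Poll once per second until a network interface exists or the timeout expires.
# Returns 0 when the interface is present, 1 on timeout.
wait_for_iface() {
  local iface="$1" timeout_s="${2:-120}" waited=0
  while [ ! -e "/sys/class/net/$iface" ]; do
    [ "$waited" -ge "$timeout_s" ] && return 1
    sleep 1
    waited=$((waited + 1))
  done
}

# Usage on a node, before adding the interfaces to the trusted zone:
#   wait_for_iface flannel.1 && wait_for_iface cni0 || echo "overlay interfaces did not appear; check 8472/udp"
```

A timeout here usually points back to the first item in the list: if 8472/udp is blocked, the overlay never comes up and the interfaces never appear.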
|
||||
|
||||
## 6. 使用方式
|
||||
|
||||
### 6.1 SSH 前置(若未配置)
|
||||
|
||||
先运行 `scripts/ssh/setup-k3s-workers-ssh.sh`,为所有 k3s 节点(含 server)配置 jack + root 公钥及 inventory 所需的私钥。
|
||||
|
||||
### 6.2 执行 playbook
|
||||
|
||||
在 `ansible/` 目录下执行:
|
||||
|
||||
```bash
|
||||
cd ansible
|
||||
# 标准 IPv4 安装
|
||||
ansible-playbook -i inventory.ini playbooks/k3s-init-and-install.yml
|
||||
```
|
||||
|
||||
执行结束后,playbook 会输出:
|
||||
|
||||
- `kubectl get nodes`
|
||||
- `kubectl get pods -n kube-system -o wide`
|
||||
- 各节点 IP 的 curl HTTP 测试结果
|
||||
|
||||
### 6.3 手动验证(可选)
|
||||
|
||||
在 server(如 ylc61)上执行:
|
||||
|
||||
```bash
|
||||
KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl get nodes -o wide
|
||||
KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl get pods -n kube-system -o wide
|
||||
```
|
||||
|
||||
确认 `/storage` 方案:
|
||||
|
||||
- server 与 worker 的 k3s 数据目录均为 `/storage`;
|
||||
- token 路径为 `/storage/server/token`。
|
||||
|
||||
## 7. 下一步
|
||||
|
||||
集群就绪后,可继续阅读:
|
||||
|
||||
- `03-09-k3s-gitops-集群配置管理.md`:用 Argo CD/Flux 管理 Traefik、监控、应用清单;
|
||||
- `01-01`、`01-02` 中的验证命令与入口验证。
|
||||
|
||||
100
docs/01-08-openwrt-haproxy.md
Normal file
@@ -0,0 +1,100 @@
|
||||
# 01-08 OpenWrt HAProxy 负载均衡
|
||||
|
||||
> 在 OpenWrt 上安装并配置 HAProxy,将 80/443 流量转发到 K3s 集群节点(Traefik 入口),实现单一入口与负载均衡。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- OpenWrt 与 K3s 节点同网段(如 192.168.2.0/24),OpenWrt 通常为网关(如 192.168.2.1)
|
||||
- 已完成 `01-02-k3s-工作节点.md` 或 `01-07`,Traefik 入口 80/443 已在各节点可达
|
||||
|
||||
## 1. 安装 HAProxy
|
||||
|
||||
```bash
|
||||
opkg update
|
||||
opkg install haproxy
|
||||
```
|
||||
|
||||
若使用 LuCI,可在「系统」→「软件包」中搜索 `haproxy` 安装。
|
||||
|
||||
## 2. 配置
|
||||
|
||||
### 2.1 原生 HAProxy 配置(推荐)
|
||||
|
||||
编辑 `/etc/haproxy.cfg` 或包提供的配置路径(部分 OpenWrt 使用 `/etc/haproxy/haproxy.cfg`)。可在 `/etc/init.d/haproxy` 中查看实际配置文件路径。
|
||||
|
||||
**完整配置见 `ansible/files/01-08-haproxy/haproxy.cfg`**(与 Ansible 共用,可复制到 OpenWrt 或通过 playbook 下发)。将 `192.168.2.61`~`192.168.2.64` 按实际 K3s 节点 IP 修改。健康检查默认为 **TCP**,如需升级见第 3 节;如需真实客户端 IP 见第 5 节 PROXY Protocol。
|
||||
|
||||
### 2.2 UCI 配置(可选)
|
||||
|
||||
部分 OpenWrt 使用 UCI 管理 HAProxy,编辑 `/etc/config/haproxy`。UCI 结构与选项因版本而异,可参考 [OpenWrt HAProxy 文档](https://openwrt.org/docs/guide-user/services/load_balancing/haproxy)
|
||||
|
||||
## 3. 健康检查
|
||||
|
||||
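There are four kinds of check, from shallow to deep: **TCP**, **HTTP**, **TLS**, and **HTTPS**.

| Type | Description | Applicable ports |
|------|-------------|------------------|
| TCP | `server ... check`; passes as soon as the port accepts a connection | 80, 443, etc. |
| HTTP | `option httpchk`; sends a plaintext HTTP request | 80 |
| TLS | `option ssl-hello-chk`; checks at the TLS handshake layer | 443 (`mode tcp`) |
| HTTPS | `option httpchk` + `server ... ssl`; HTTP over TLS | 443 (`mode http`) |

Note: if 443 traffic is **TCP passthrough**, the backend is `mode tcp` and only the TCP or TLS check applies; an HTTPS-level check needs a separate `mode http` backend. `option ssl-hello-chk` sends an SSLv3-style client hello, so if the backend rejects it and the check fails, `check-ssl verify none` on the `server` line is the usual alternative.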
|
||||
|
||||
### 3.1 TCP(2.1 默认)
|
||||
|
||||
即 `ansible/files/01-08-haproxy/haproxy.cfg` 中的 backend 块。
|
||||
|
||||
### 3.2 HTTP(80 明文)
|
||||
|
||||
替换 2.1 中 `backend k3s_http` 块;frontend `http_in` 仍指向 `k3s_http`。在 backend 开头加 `option httpchk GET /`。
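As an illustration of the replacement described above, the sketch below prints a `backend k3s_http` block with the HTTP check enabled. The generator name `print_k3s_http_backend` and the `ylcNN` naming derived from the last IP octet are assumptions for this example; the real block lives in `ansible/files/01-08-haproxy/haproxy.cfg`, so keep whatever `mode` and `balance` lines it already has.

```bash
#!/usr/bin/env bash
# Hypothetical generator: print a `backend k3s_http` block with an HTTP health check.
# Arguments are node IPs; server names are derived from the last octet (ylc61 style).
print_k3s_http_backend() {
  printf 'backend k3s_http\n'
  printf '    option httpchk GET /\n'
  local ip
  for ip in "$@"; do
    # ${ip##*.} keeps only the text after the last dot, e.g. 61.
    printf '    server ylc%s %s:80 check\n' "${ip##*.}" "$ip"
  done
}

print_k3s_http_backend 192.168.2.61 192.168.2.62 192.168.2.63 192.168.2.64
```

HAProxy treats only 2xx and 3xx responses as healthy by default. If Traefik answers `GET /` with 404 because no router matches that path, the servers will be marked down, so point the check at a path that returns 200 in your setup.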
|
||||
|
||||
### 3.3 TLS(443 握手,`mode tcp`)
|
||||
|
||||
替换 2.1 中 `backend k3s_https` 块;frontend `https_in` 仍指向 `k3s_https`。在 backend 中加 `option ssl-hello-chk`。
|
||||
|
||||
### 3.4 HTTPS(443 应用层,`mode http` + `ssl`)
|
||||
|
||||
适用于 **HAProxy 在 443 终结 TLS** 的场景(frontend 需 `mode http` 且 `bind` 时带 ssl)。若仍为 TCP 透传,用 3.3 即可。需与 Traefik 路由匹配的 `Host`;自签/内网 CA 可用 `verify none`,生产建议 `ca-file`。
|
||||
|
||||
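```haproxy
backend k3s_https_httpchk
    mode http
    # Legacy one-line syntax; on HAProxy 2.2+ the equivalent is `option httpchk` plus
    # `http-check send meth GET uri / ver HTTP/1.1 hdr Host your-ingress.example.com`.
    option httpchk GET / HTTP/1.1\r\nHost:\ your-ingress.example.com
    # `ssl verify none` covers traffic and health checks alike; prefer `ca-file` in production.
    default-server ssl verify none
    server ylc61 192.168.2.61:443 check
    server ylc62 192.168.2.62:443 check
    server ylc63 192.168.2.63:443 check
    server ylc64 192.168.2.64:443 check
```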
|
||||
|
||||
## 4. 启动与验证
|
||||
|
||||
```bash
|
||||
/etc/init.d/haproxy enable
|
||||
/etc/init.d/haproxy restart
|
||||
```
|
||||
|
||||
验证:从内网访问 `http://<OpenWrt-IP>/` 或 `http://<OpenWrt-IP>/demo-m1/`(02-05 矩阵),应能到达 Traefik 与后端。
|
||||
|
||||
## 5. PROXY Protocol(可选)
|
||||
|
||||
若 Traefik 需获取真实客户端 IP,可在 HAProxy 后端每个 `server` 行添加 `send-proxy-v2`,并在 Traefik 配置 `trustedIPs` 包含 OpenWrt 网段(见 `03-02-k3s-traefik-acme.md`)。
|
||||
|
||||
**完整配置见 `ansible/files/01-08-haproxy/haproxy-proxy.cfg`**(仅 TCP 检查 + PROXY)。
|
||||
|
||||
**健康检查与 PROXY 组合**:`ansible/files/01-08-haproxy/haproxy-proxy-http-tls.cfg` 为 HTTP 检查 + TLS 检查 + PROXY 的完整示例。
|
||||
|
||||
Traefik 端需启用 PROXY protocol 监听并信任 OpenWrt 的 IP,否则会报错。UCI 配置需参考 OpenWrt HAProxy 文档中的相应选项。
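For an existing config, adding `send-proxy-v2` to each `server` line can be scripted. This is a sketch only: `add_send_proxy_v2` is a hypothetical filter, and it assumes one `server` directive per line as in the examples in this document.

```bash
#!/usr/bin/env bash
# Hypothetical filter: append `send-proxy-v2` to every `server` line that lacks it.
# Reads an haproxy.cfg on stdin and writes the modified config to stdout.
add_send_proxy_v2() {
  awk '
    # Only lines whose first word is exactly "server"; default-server is left alone.
    $1 == "server" && $0 !~ /send-proxy/ { print $0 " send-proxy-v2"; next }
    { print }
  '
}

# Usage (review the result before replacing the live file):
#   add_send_proxy_v2 < /etc/haproxy.cfg > /tmp/haproxy.cfg.new
```

The filter is idempotent, so running it twice leaves the config unchanged. Enable the PROXY protocol on the Traefik side first, otherwise requests will fail as soon as HAProxy starts sending the header.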
|
||||
|
||||
## 6. 端口与防火墙
|
||||
|
||||
**80/443 被封**:家庭网络若被运营商封禁 80/443,可改用 18080/18443 等非标准端口,供直接访问或配合 Cloudflare CDN 端口转发。将 frontend 的 `bind *:80`、`bind *:443` 改为 `bind *:18080`、`bind *:18443` 即可;Traefik 入口及后端保持不变。
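The port change described above can be previewed with a small filter. This is a sketch under assumptions: `rebind_high_ports` is a hypothetical name, and it only rewrites lines of the exact form `bind *:80` and `bind *:443`.

```bash
#!/usr/bin/env bash
# Hypothetical filter: move frontend listeners from 80/443 to 18080/18443.
# Only `bind *:80` and `bind *:443` lines are rewritten; backend server lines are untouched.
rebind_high_ports() {
  sed -E \
    -e 's/^([[:space:]]*bind[[:space:]]+\*:)80([[:space:]]|$)/\118080\2/' \
    -e 's/^([[:space:]]*bind[[:space:]]+\*:)443([[:space:]]|$)/\118443\2/'
}

# Usage (compare before replacing the live file):
#   rebind_high_ports < /etc/haproxy.cfg | diff /etc/haproxy.cfg -
```

In the replacement, `\1` is the captured `bind *:` prefix and the digits that follow are literal, so `\118080` yields `bind *:18080`.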
|
||||
|
||||
**防火墙**:确保 OpenWrt 放行实际监听端口(80/443 或 18080/18443 等)入站,或将 HAProxy 监听接口加入相应 zone。
|
||||
|
||||
## 相关文档
|
||||
|
||||
- `01-02-k3s-工作节点.md`:Traefik 入口与 LB 基线
|
||||
- `02-05-nginx-验证矩阵-一键部署.md`:验证矩阵(按入口 IP 访问)
|
||||
- `03-02-k3s-traefik-acme.md`:PROXY protocol、trustedIPs
|
||||
112
docs/02-00-nginx-系列说明.md
Normal file
@@ -0,0 +1,112 @@
|
||||
# 02-00 Nginx 矩阵系列说明(节点 + Ingress / IngressRoute)
|
||||
|
||||
> 目的:先把本系列(02-01~02-05)共用的“节点调度规则”和“Traefik 路由对象差异”讲清楚,后面的 02-01~02-04 分篇才能读得更快、更少踩坑。
|
||||
|
||||
---
|
||||
|
||||
## 1. 节点与调度(M1~M4 到底落在哪)
|
||||
|
||||
本仓库的 nginx 矩阵里,4 个场景 M1~M4 的“落点”不同,主要通过 `nodeSelector` 控制。
|
||||
|
||||
### 1.1 查看 `node-role` 与 `hostname`(用于排查 nodeSelector)
|
||||
|
||||
```bash
|
||||
# 1) 最直观:查看所有节点的全部 labels
|
||||
kubectl get nodes --show-labels
|
||||
|
||||
# 2) 检查 hostname label(用于 M2/M4)
|
||||
kubectl get nodes -l kubernetes.io/hostname=<hostname> -o name
|
||||
|
||||
# 3) 检查 control-plane label(用于 M1)
|
||||
kubectl get nodes -l node-role.kubernetes.io/control-plane -o name
|
||||
|
||||
# 4) 检查 worker label(用于 M3)
|
||||
kubectl get nodes -l node-role.kubernetes.io/worker -o name
|
||||
|
||||
# 5) 最后:查看单节点全部信息(Labels / Taints / Events)
|
||||
kubectl describe node <节点名>
|
||||
```
|
||||
|
||||
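Tip: kubelet normally sets the `kubernetes.io/hostname` label automatically, so an M2/M4 selector such as `nodeSelector: kubernetes.io/hostname: ylc61` fails to match mainly when the node's registered name is not `ylc61` (for example an FQDN or a different hostname). In that case the M2/M4 Pod stays Pending.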
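To spot a mismatch between node names and hostname labels quickly, the two values can be compared side by side. A minimal sketch: `hostname_label_mismatch` is a hypothetical helper, and it expects two whitespace-separated columns without a header.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print nodes whose name differs from their kubernetes.io/hostname label.
# Input: two whitespace-separated columns (NAME, HOSTNAME label), no header line.
hostname_label_mismatch() {
  awk 'NF >= 2 && $1 != $2 { print $1 " -> " $2 }'
}

# Usage (dots inside the label key are escaped for custom-columns):
#   kubectl get nodes --no-headers \
#     -o custom-columns='NAME:.metadata.name,HOST:.metadata.labels.kubernetes\.io/hostname' \
#     | hostname_label_mismatch
```

No output means every node name matches its label, so a `nodeSelector` on the short name should match.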
|
||||
|
||||
### 1.2 M1 / M2:控制节点(control-plane)
|
||||
|
||||
- **M1(Ingress)**:用 `nodeSelector: node-role.kubernetes.io/control-plane: ""` + `tolerations`(允许调度到控制节点的 `NoSchedule` 污点)。
|
||||
- **M2(IngressRoute)**:用 `nodeSelector: kubernetes.io/hostname: ylc61`(指定某一台控制节点)。
|
||||
|
||||
如果你看到 M1/M2 的 Pod 一直 Pending:
|
||||
- 通常是控制节点缺少对应的 label(`node-role.kubernetes.io/control-plane` 或 hostname 不匹配)
|
||||
- 或者存在 taint 没有匹配到 M1 的 toleration
|
||||
|
||||
### 1.3 M3 / M4:工作节点(worker)
|
||||
|
||||
- **M3(Ingress)**:`nodeSelector: node-role.kubernetes.io/worker: ""`,因此会“随机落在任意 worker”。
|
||||
- **M4(IngressRoute)**:`nodeSelector: kubernetes.io/hostname: ylc64`,指定某一台 worker。
|
||||
|
||||
如果你看到 M3/M4 落不到目标节点:
|
||||
- 确认 worker 节点带有 `node-role.kubernetes.io/worker` 标签(M3)
|
||||
- 或 hostname 是否与你写入的 `ylc64` 一致(M4)
|
||||
|
||||
---
|
||||
|
||||
## 2. Ingress vs IngressRoute(Traefik 路由对象差异)
|
||||
|
||||
本项目使用 Traefik:
|
||||
- `Ingress`:Kubernetes 原生对象 `kind: Ingress`(依赖 Ingress Controller 对接)
|
||||
- `IngressRoute`:Traefik CRD 对象 `kind: IngressRoute`(需要 CRD 已启用)
|
||||
|
||||
在本系列中,它们还体现为两类“路由写法”:
|
||||
|
||||
### 2.1 Ingress(用于 M1 / M3)
|
||||
|
||||
- 采用 `spec.rules.http.paths`,典型写法是:
|
||||
- `path: /demo-mx`
|
||||
- `pathType: Prefix`
|
||||
- 中间件挂载方式在本仓库的示例里使用 annotation:
|
||||
- `traefik.ingress.kubernetes.io/router.middlewares: default-stripprefix-mx@kubernetescrd`
|
||||
|
||||
### 2.2 IngressRoute(用于 M2 / M4)
|
||||
|
||||
- 采用 `spec.entryPoints` + `spec.routes`,并使用 `match`:
|
||||
- `entryPoints: [web]`
|
||||
- `match: PathPrefix(\`/demo-mx\`)`
|
||||
- `services` 指向对应 Service
|
||||
- `middlewares` 在 CRD 内直接引用(`name: stripprefix-mx`)
|
||||
|
||||
---
|
||||
|
||||
## 3. `/demo-mx` 路由结构(为什么还要 Middleware)
|
||||
|
||||
每个 Mx 都有一个 `Middleware`(`kind: Middleware`)做 `stripPrefix`:
|
||||
- 前缀例如 `/demo-m1`
|
||||
- 去掉前缀后,nginx 才能把请求当成 `/` 来返回 `index.html`
|
||||
|
||||
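So in this series:

- A request to `http://<entry-node-IP>/demo-m1/` first passes through `stripPrefix`
- nginx ultimately receives `/`
- The page shows the scenario marker (M1/M2/M3/M4), so you can confirm that placement and routing are correct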
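The path rewrite can be illustrated with a tiny model. This is not Traefik code: `strip_prefix` is a hypothetical shell function that mimics the effect described above, namely removing the configured prefix and keeping a leading slash.

```bash
#!/usr/bin/env bash
# Hypothetical model of the stripPrefix effect (illustration only, not Traefik code).
# $1 = prefix configured in the Middleware, $2 = incoming request path.
strip_prefix() {
  local prefix="$1" path="$2"
  case "$path" in
    "$prefix"*) path="${path#"$prefix"}" ;;   # drop the leading prefix when it matches
  esac
  case "$path" in
    /*) ;;                 # already starts with a slash
    *) path="/$path" ;;    # ensure a leading slash (also covers an empty remainder)
  esac
  printf '%s\n' "$path"
}

strip_prefix /demo-m1 /demo-m1/            # prints /
strip_prefix /demo-m1 /demo-m1/index.html  # prints /index.html
```

Without the Middleware, nginx would receive `/demo-m1/` and look for a `demo-m1` directory under its web root, which does not exist in these demos.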
|
||||
|
||||
---
|
||||
|
||||
## 4. 本系列的“共同资源”你应该知道怎么删/怎么验证
|
||||
|
||||
所有 M1~M4 都在同一个命名空间 `default`:
|
||||
- Pod/Service:`kubectl get pod,svc -n default ...`
|
||||
- Ingress/IngressRoute:分别 `kubectl get ing -n default` 或 `kubectl get ingressroute -n default`
|
||||
|
||||
通用删除建议使用 manifests 目录(一键清理同一个场景):
|
||||
|
||||
```bash
|
||||
kubectl delete -f ansible/files/nginx-matrix/ -R
|
||||
```
|
||||
|
||||
或按具体文件删单个场景(见各分篇的 `## 删除` 小节)。
|
||||
|
||||
---
|
||||
|
||||
## 5. 推荐阅读顺序
|
||||
|
||||
1. `02-00-nginx-系列说明.md`(本页)
|
||||
2. `02-01~02-04`(按你关心的节点落点/路由对象读)
|
||||
3. `02-05-nginx-验证矩阵-一键部署.md`(最终整合、一次验证 4 个场景)
|
||||
|
||||
50
docs/02-01-nginx-control-ingress.md
Normal file
@@ -0,0 +1,50 @@
|
||||
# 02-01 Nginx + 控制节点 + Ingress(M1)
|
||||
|
||||
> 场景:nginx 落在控制节点(`nodeSelector: node-role.kubernetes.io/control-plane`),使用标准 Ingress 暴露 `/demo-m1`。整合于 `02-05-nginx-验证矩阵-一键部署.md`。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-02-k3s-工作节点.md`
|
||||
- 控制节点有 `node-role.kubernetes.io/control-plane` 标签
|
||||
- 入口节点 80 可达
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 部署 nginx Deployment(nodeSelector 控制节点)+ Service
|
||||
2. 创建 Middleware + Ingress(`/demo-m1` -> nginx-m1:80)
|
||||
3. 等待 Pod 与 Ingress 就绪
|
||||
|
||||
示例 YAML 见 `ansible/files/nginx-matrix/01-control-ingress.yaml`。
|
||||
|
||||
## 部署命令
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nginx-matrix/01-control-ingress.yaml
|
||||
```
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
kubectl get pod,svc,ing -n default -o wide
|
||||
curl -i --max-time 3 http://<入口节点IP>/demo-m1/
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
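- Returns 200 and the page shows the **M1** marker (a stock "Welcome to nginx!" page means the custom `index.html` mount did not take effect; see `02-05-nginx-验证矩阵-一键部署.md`)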
|
||||
- Pod 落在控制节点(`kubectl get pod -o wide` 的 NODE 列为控制节点)
|
||||
|
||||
## 删除
|
||||
|
||||
```bash
|
||||
kubectl delete -f ansible/files/nginx-matrix/01-control-ingress.yaml
|
||||
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 确认 Traefik 接管 Ingress、Service/Endpoint 正常
|
||||
- 参考 `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 下一步
|
||||
|
||||
- 返回 `02-05-nginx-验证矩阵-一键部署.md` 或 `00-00-构建总览.md`
|
||||
51
docs/02-02-nginx-control-ingressroute.md
Normal file
@@ -0,0 +1,51 @@
|
||||
# 02-02 Nginx + 控制节点 + IngressRoute(M2)
|
||||
|
||||
> 场景:nginx 指定一台控制节点(`nodeSelector: kubernetes.io/hostname: ylc61`),路由使用 Traefik CRD `IngressRoute`,暴露 `/demo-m2`。整合于 `02-05-nginx-验证矩阵-一键部署.md`。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-02-k3s-工作节点.md`
|
||||
- Traefik CRD 可用
|
||||
- 按实际控制节点名(如 ylc61)修改 `nodeSelector` 中的 hostname
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 部署 nginx Deployment(nodeSelector 指定控制节点 hostname)+ Service
|
||||
2. 创建 Middleware + IngressRoute(`PathPrefix(/demo-m2)`)
|
||||
3. 等待资源就绪
|
||||
|
||||
示例 YAML 见 `ansible/files/nginx-matrix/02-control-ingressroute.yaml`。
|
||||
|
||||
## 部署命令
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nginx-matrix/02-control-ingressroute.yaml
|
||||
```
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
kubectl get pod,svc -n default -o wide
|
||||
kubectl get ingressroute -n default
|
||||
curl -i --max-time 3 http://<入口节点IP>/demo-m2/
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
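- Returns 200 and the page shows the **M2** marker (a stock "Welcome to nginx!" page means the custom `index.html` mount did not take effect; see `02-05-nginx-验证矩阵-一键部署.md`)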
|
||||
- Pod 落在指定控制节点
|
||||
|
||||
## 删除
|
||||
|
||||
```bash
|
||||
kubectl delete -f ansible/files/nginx-matrix/02-control-ingressroute.yaml
|
||||
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 确认 CRD 已存在、Traefik 日志无路由错误
|
||||
- 参考 `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 下一步
|
||||
|
||||
- 返回 `02-05-nginx-验证矩阵-一键部署.md` 或 `00-00-构建总览.md`
|
||||
50
docs/02-03-nginx-worker-ingress.md
Normal file
@@ -0,0 +1,50 @@
|
||||
# 02-03 Nginx + 工作节点 + Ingress(M3)
|
||||
|
||||
> 场景:nginx 随机一台工作节点(`nodeSelector: node-role.kubernetes.io/worker: ""`),跨节点 Ingress 暴露 `/demo-m3`。整合于 `02-05-nginx-验证矩阵-一键部署.md`。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-02-k3s-工作节点.md`
|
||||
- 工作节点有 `node-role.kubernetes.io/worker` 标签
|
||||
- 工作节点网络连通(8472/udp、firewalld 基线)
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 部署 nginx Deployment(nodeSelector 工作节点标签,随机调度)+ Service
|
||||
2. 创建 Middleware + Ingress(`/demo-m3` -> nginx-m3:80)
|
||||
3. 等待资源就绪
|
||||
|
||||
示例 YAML 见 `ansible/files/nginx-matrix/03-worker-ingress.yaml`。
|
||||
|
||||
## 部署命令
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nginx-matrix/03-worker-ingress.yaml
|
||||
```
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
kubectl get pod,svc,ing -n default -o wide
|
||||
curl -i --max-time 3 http://<入口节点IP>/demo-m3/
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
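- Returns 200 and the page shows the **M3** marker (a stock "Welcome to nginx!" page means the custom `index.html` mount did not take effect; see `02-05-nginx-验证矩阵-一键部署.md`)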
|
||||
- Pod 落在任一工作节点(随机调度)
|
||||
|
||||
## 删除
|
||||
|
||||
```bash
|
||||
kubectl delete -f ansible/files/nginx-matrix/03-worker-ingress.yaml
|
||||
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 检查 8472/udp、firewalld 转发(flannel.1/cni0 trusted)
|
||||
- 参考 `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 下一步
|
||||
|
||||
- 返回 `02-05-nginx-验证矩阵-一键部署.md` 或 `00-00-构建总览.md`
|
||||
53
docs/02-04-nginx-worker-ingressroute.md
Normal file
@@ -0,0 +1,53 @@
|
||||
# 02-04 Nginx + 工作节点 + IngressRoute(M4)
|
||||
|
||||
> 场景:nginx 指定落在 ylc64(`nodeSelector: kubernetes.io/hostname: ylc64`),跨节点 IngressRoute 暴露 `/demo-m4`。整合于 `02-05-nginx-验证矩阵-一键部署.md`。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-02-k3s-工作节点.md`
|
||||
- Traefik CRD 可用
|
||||
- 工作节点网络连通
|
||||
- 按实际工作节点名(如 ylc64)修改 `nodeSelector` 中的 hostname
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 部署 nginx Deployment(nodeSelector 指定工作节点 hostname)+ Service
|
||||
2. 创建 Middleware + IngressRoute(`PathPrefix(/demo-m4)`)
|
||||
3. 等待资源就绪
|
||||
|
||||
示例 YAML 见 `ansible/files/nginx-matrix/04-worker-ingressroute.yaml`。
|
||||
|
||||
## 部署命令
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nginx-matrix/04-worker-ingressroute.yaml
|
||||
```
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
kubectl get pod,svc -n default -o wide
|
||||
kubectl get ingressroute -n default
|
||||
curl -i --max-time 3 http://<入口节点IP>/demo-m4/
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
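- Returns 200 and the page shows the **M4** marker (a stock "Welcome to nginx!" page means the custom `index.html` mount did not take effect; see `02-05-nginx-验证矩阵-一键部署.md`)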
|
||||
- Pod 落在指定工作节点
|
||||
|
||||
## 删除
|
||||
|
||||
```bash
|
||||
kubectl delete -f ansible/files/nginx-matrix/04-worker-ingressroute.yaml
|
||||
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 优先看 Traefik 日志
|
||||
- 检查节点间网络转发、firewalld 基线
|
||||
- 参考 `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 下一步
|
||||
|
||||
- 返回 `02-05-nginx-验证矩阵-一键部署.md` 或 `00-00-构建总览.md`
|
||||
202
docs/02-05-nginx-验证矩阵-一键部署.md
Normal file
@@ -0,0 +1,202 @@
|
||||
# 02-05 Nginx 验证矩阵(Ingress / IngressRoute)— 综合一键部署
|
||||
|
||||
> **定位**:02 系列尾部,整合 02-01~02-04 的综合一键部署。4 种组合(控制节点/工作节点 × Ingress/IngressRoute)均有具体 Deployment + Service + 路由,节点 IP 访问(如 `http://入口IP/demo-m1/`)。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-02-k3s-工作节点.md`(Traefik 与 LB 可用)
|
||||
- **M1**:Deployment 需含 `nodeSelector: node-role.kubernetes.io/control-plane: ""` 及控制平面污点的 **toleration**(见下 YAML),否则控制节点若有 `NoSchedule` 污点会导致 Pod 一直 Pending、访问 /demo-m1 报 "no available server";若控制节点无该标签,需先打标或改 M1 为 hostname(同 M2)
|
||||
- **M2**:Deployment 必须含 `nodeSelector: kubernetes.io/hostname: <控制节点名>`(示例 `ylc61`),按实际修改
|
||||
- **M3**:Deployment 含 `nodeSelector: node-role.kubernetes.io/worker: ""`,随机调度到任一工作节点;需工作节点有该标签
|
||||
- **M4**:Deployment 必须含 `nodeSelector: kubernetes.io/hostname: ylc64`(指定工作节点),按实际修改
|
||||
|
||||
## 部署说明(Pod、路径)
|
||||
|
||||
|
||||
| Scenario | Path | Description |
| --- | --- | --- |
| M1 control-plane node + Ingress | `/demo-m1` | nginx-m1, any control-plane node |
| M2 control-plane node + IngressRoute | `/demo-m2` | nginx-m2, one specific control-plane node |
| M3 worker node + Ingress | `/demo-m3` | nginx-m3, any worker node |
| M4 worker node + IngressRoute | `/demo-m4` | nginx-m4, one specific worker node |
|
||||
|
||||
|
||||
## 完整配置(与 Ansible 共用)
|
||||
|
||||
配置位于 `ansible/files/nginx-matrix/`(4 个文件对应 M1~M4),文档与 Ansible 共用此目录:
|
||||
|
||||
| File | Scenario | Path | Node placement |
|------|----------|------|----------------|
| 01-control-ingress.yaml | M1 control-plane + Ingress | /demo-m1 | nodeSelector control-plane + toleration |
| 02-control-ingressroute.yaml | M2 control-plane + IngressRoute | /demo-m2 | pinned by hostname (default ylc61) |
| 03-worker-ingress.yaml | M3 worker + Ingress | /demo-m3 | nodeSelector worker (scheduler's choice) |
| 04-worker-ingressroute.yaml | M4 worker + IngressRoute | /demo-m4 | pinned by hostname (default ylc64) |
|
||||
|
||||
按需修改 **M2**(`ylc61`)、**M4**(`ylc64`)的 hostname;**M3** 需工作节点有 `node-role.kubernetes.io/worker` 标签。计算机名使用短主机名(ylc61~ylc64)便于配合 Cloudflare CDN。
|
||||
|
||||
## 部署
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nginx-matrix/ -R
|
||||
kubectl get pod,svc,ing,ingressroute -n default -o wide
|
||||
```
|
||||
|
||||
## 验证(用 IP 访问)
|
||||
|
||||
直接用入口节点 IP 访问(将 `192.168.2.61` 改为你的入口 IP;按 01-02/01-07 已配 LB 时任选节点 IP)。
|
||||
|
||||
```bash
|
||||
for path in demo-m1 demo-m2 demo-m3 demo-m4; do
|
||||
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 3 "http://192.168.2.61/${path}/" 2>/dev/null) || code="---"
|
||||
echo "/${path}/: ${code}"
|
||||
done
|
||||
```
|
||||
|
||||
预期:4 个路径均返回 `200`,页面分别显示 **M1**、**M2**、**M3**、**M4**(及对应说明、Backend: M1 等),便于区分是哪个后端。
|
||||
|
||||
### M1 troubleshooting: "no available server" or the stock nginx welcome page
|
||||
|
||||
**若访问 /demo-m1 报 "no available server"**:多为 M1 Pod 未调度(Pending),Service 无 endpoint。控制节点常有 `node-role.kubernetes.io/control-plane:NoSchedule` 污点,Deployment 需加上述 **toleration** 才能调度。可用下面命令逐项检查:
|
||||
|
||||
```bash
|
||||
# 1. M1 Pod 是否存在、是否 Running(Pending 说明未调度)
|
||||
kubectl get pod -n default -l app=nginx-m1 -o wide
|
||||
|
||||
# 2. M1 Service 是否有 endpoint(无 endpoint 则 Traefik 报 no available server)
|
||||
kubectl get endpoints -n default nginx-m1
|
||||
|
||||
# 3. 若 Pod 为 Pending,看调度失败原因(Events 里会写 taint/未满足)
|
||||
kubectl describe pod -n default -l app=nginx-m1
|
||||
|
||||
# 4. 确认控制节点标签与污点(M1 需调度到带 control-plane 的节点,且 Deployment 需有对应 toleration)
|
||||
kubectl get nodes -o custom-columns=NAME:.metadata.name,LABEL:.metadata.labels.node-role\.kubernetes\.io/control-plane,TAINT:.spec.taints
|
||||
```
|
||||
|
||||
若 Events 为 **`node(s) didn't match Pod's node affinity/selector`**:说明没有任何节点带 `node-role.kubernetes.io/control-plane` 标签,需给控制节点打标(控制节点一般为运行 k3s server 的那台,如 ylc61):
|
||||
|
||||
```bash
|
||||
# 先看节点名:kubectl get nodes
|
||||
# 再给控制节点打标(把 ylc61 换成实际控制节点名)
|
||||
kubectl label node ylc61 node-role.kubernetes.io/control-plane= --overwrite
|
||||
```
|
||||
|
||||
打标后 M1 Pod 会自动调度到该节点并 Running;若仍 Pending,再执行一次 `kubectl get pod -n default -l app=nginx-m1 -o wide` 与 `kubectl describe pod -n default -l app=nginx-m1` 看 Events。
|
||||
|
||||
下面为"M1 能访问但仍是默认页"时的容器内排查。在**已配置 KUBECONFIG 的机器**上执行(将 `KUBECONFIG` 改为实际路径,如 `/etc/rancher/k3s/k3s.yaml`):
|
||||
|
||||
```bash
|
||||
# 1. 看 html 目录下有哪些文件(是否包含我们挂的 index.html)
|
||||
kubectl exec -n default deployment/nginx-m1 -- ls -la /usr/share/nginx/html/
|
||||
|
||||
# 2. 看 index.html 内容(应是 M1 的 HTML,若仍是 Welcome to nginx 说明挂载未生效)
|
||||
kubectl exec -n default deployment/nginx-m1 -- cat /usr/share/nginx/html/index.html
|
||||
|
||||
# 3. 看 conf.d 下有哪些配置(是否有多个 server,或 default.conf 被覆盖)
|
||||
kubectl exec -n default deployment/nginx-m1 -- ls -la /etc/nginx/conf.d/
|
||||
|
||||
# 4. 看 default.conf 内容(应含 root、index、X-Backend M1)
|
||||
kubectl exec -n default deployment/nginx-m1 -- cat /etc/nginx/conf.d/default.conf
|
||||
|
||||
# 5. 看 nginx 最终生效的 server 配置(确认谁在监听 80、root 指向哪)
|
||||
kubectl exec -n default deployment/nginx-m1 -- nginx -T 2>&1 | grep -A 200 "server {"
|
||||
```
|
||||
|
||||
**根据输出可判断**:若 `index.html` 仍是默认欢迎页 → ConfigMap/volumeMount 未生效或未覆盖到该路径;若 `conf.d/` 下有多于一个 `*.conf` 或 `default.conf` 不是我们的内容 → 配置被覆盖或未挂载;若 `nginx -T` 里 80 端口的 `root` 不是 `/usr/share/nginx/html` 或没有我们的 `location` → 被其他 server 块优先。把上述命令的输出贴出后,即可针对性改 manifest 或挂载方式。
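The first judgement above (is `index.html` still the stock page?) is easy to automate. A minimal sketch: `is_default_nginx_page` is a hypothetical helper that simply looks for the stock page's heading text.

```bash
#!/usr/bin/env bash
# Hypothetical helper: decide whether an index.html is still the stock nginx page.
# Reads the HTML on stdin; exit 0 = stock welcome page, exit 1 = custom content.
is_default_nginx_page() {
  grep -q 'Welcome to nginx!'
}

# Usage:
#   if kubectl exec -n default deployment/nginx-m1 -- cat /usr/share/nginx/html/index.html \
#        | is_default_nginx_page; then
#     echo "M1 still serves the stock page: the ConfigMap mount is not in effect"
#   fi
```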
|
||||
|
||||
**为何 M1~M4 都是单文件(subPath)挂载,却只有 M1 不正常?**
|
||||
Manifest 里四份写法一致,若只有 M1 仍显示默认页,多半是集群里 nginx-m1 的 Deployment/ReplicaSet 曾用旧 spec 部署过,导致当前 Pod 未带 volumeMount。根因未查清时,最稳妥是**删除 M1 部署再重新部署**(见下)。
|
||||
|
||||
**Directory-based deployment (one `apply` for all of M1~M4)**:
|
||||
|
||||
```bash
|
||||
# 在仓库根目录执行时:
|
||||
kubectl apply -f ansible/files/nginx-matrix/ -R
|
||||
|
||||
# 若当前在 ansible/ 目录下,改用:
|
||||
kubectl apply -f files/nginx-matrix/ -R
|
||||
```
|
||||
|
||||
**M1 未生效时:删除部署再重新部署(推荐)**
|
||||
|
||||
先删 M1 的 Deployment,再 apply 01,这样会新建唯一的 ReplicaSet,不会留下旧 spec 的 Pod。
|
||||
|
||||
```bash
|
||||
# 1. 只删 M1 的 Deployment(Service/Ingress/Middleware/ConfigMap 保留,稍后 apply 会复用或更新)
|
||||
kubectl delete deployment nginx-m1 -n default
|
||||
|
||||
# 2. 重新部署 M1(在 ansible/ 目录下)
|
||||
kubectl apply -f files/nginx-matrix/01-control-ingress.yaml
|
||||
|
||||
# 若在仓库根目录:
|
||||
# kubectl apply -f ansible/files/nginx-matrix/01-control-ingress.yaml
|
||||
|
||||
# 3. 等 Pod Running 后验证
|
||||
kubectl get pod -n default -l app=nginx-m1
|
||||
kubectl exec -n default deployment/nginx-m1 -- cat /usr/share/nginx/html/index.html
|
||||
kubectl exec -n default deployment/nginx-m1 -- cat /etc/nginx/conf.d/default.conf
|
||||
```
|
||||
|
||||
使用 Ansible 部署时,playbook 会自动跑一遍上述诊断并打印「M1 容器内诊断结果」,便于直接查看。
|
||||
|
||||
## Ansible 一键部署
|
||||
|
||||
可使用 Ansible playbook 自动完成复制 manifests、apply、等待 Pod 就绪及 curl 验证:
|
||||
|
||||
- **Playbook**:`ansible/playbooks/nginx-matrix-deploy.yml`
|
||||
- **Manifests 位置**:`ansible/files/nginx-matrix/`(M1 control-plane / M2 M4 节点名 ylc61、ylc64,M3 worker;按实际修改 M2/M4 节点名)
|
||||
- **执行(在 ansible/ 目录下)**:
|
||||
|
||||
```bash
|
||||
cd ansible
|
||||
ansible-playbook -i inventory.ini playbooks/nginx-matrix-deploy.yml
|
||||
```
|
||||
|
||||
若 manifests 目录未找到,可改为在仓库根目录执行:
|
||||
|
||||
```bash
|
||||
ansible-playbook -i ansible/inventory.ini ansible/playbooks/nginx-matrix-deploy.yml
|
||||
```
|
||||
|
||||
Playbook 会:拷贝 manifests 到控制节点 → **先删除全部 nginx 矩阵 Deployment**(nginx-m1~m4,若存在)→ `kubectl apply` → `rollout restart` M1~M4 → 等待 Pod 就绪 → 输出 16 个目标(4 节点 × 4 路径)的 curl 矩阵。先删再 apply 可避免旧 ReplicaSet 导致任一 Mx 仍显示默认页。
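The 16-target matrix (4 nodes × 4 paths) can also be reproduced by hand. This is a sketch, not the playbook's actual implementation: `matrix_urls` and `probe_matrix` are hypothetical names, and the IPs and paths are the ones used in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every node x path URL of the HTTP matrix.
# $1 = space-separated node IPs, $2 = space-separated path names (without slashes).
matrix_urls() {
  local ip path
  # $1 and $2 are intentionally unquoted so the shell splits them into words.
  for ip in $1; do
    for path in $2; do
      printf 'http://%s/%s/\n' "$ip" "$path"
    done
  done
}

# Probe each URL; prints "<status> <url>", with "---" when curl cannot connect.
probe_matrix() {
  local url code
  matrix_urls "$1" "$2" | while read -r url; do
    code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 3 "$url" 2>/dev/null) || code='---'
    printf '%s %s\n' "$code" "$url"
  done
}

# Usage:
#   probe_matrix "192.168.2.61 192.168.2.62 192.168.2.63 192.168.2.64" "demo-m1 demo-m2 demo-m3 demo-m4"
```

Sixteen lines starting with `200` correspond to the fully green matrix the playbook reports.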
|
||||
|
||||
## 删除
|
||||
|
||||
**手动 kubectl apply 的**:用同一目录删除
|
||||
|
||||
```bash
|
||||
kubectl delete -f ansible/files/nginx-matrix/ -R
|
||||
```
|
||||
|
||||
**Ansible playbook 部署的**:在仓库根或 ansible 同级的机器上,用 manifests 删除(需配置 KUBECONFIG)
|
||||
|
||||
```bash
|
||||
export KUBECONFIG=/etc/rancher/k3s/k3s.yaml # 或从控制节点拷贝 kubeconfig
|
||||
kubectl delete -f ansible/files/nginx-matrix/ -R
|
||||
```
|
||||
|
||||
若控制节点上 `/tmp/nginx-matrix/` 仍存在,也可在控制节点执行:
|
||||
|
||||
```bash
|
||||
sudo kubectl delete -f /tmp/nginx-matrix/ -R
|
||||
```
|
||||
|
||||
**按资源名删除**(适用于 manifests 已不可用)
|
||||
|
||||
```bash
|
||||
kubectl delete deployment,svc,ingress -n default nginx-m1 nginx-m2 nginx-m3 nginx-m4
|
||||
kubectl delete ingressroute -n default nginx-m2 nginx-m4
|
||||
kubectl delete middleware -n default stripprefix-m1 stripprefix-m2 stripprefix-m3 stripprefix-m4
|
||||
kubectl delete configmap -n default nginx-m1-html nginx-m2-html nginx-m3-html nginx-m4-html
|
||||
```
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-01-k3s-traefik-dashboard.md`:Dashboard
|
||||
- `03-02-k3s-traefik-acme.md`
|
||||
- 返回 `00-00-构建总览.md` 按导航继续
|
||||
|
||||
## 相关文档
|
||||
|
||||
- `02-01`~`02-04`:分篇说明(路径 /demo-m1~m4、nodeSelector 与本文一致)
|
||||
- `03-02-k3s-traefik-acme.md`
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
93
docs/03-01-k3s-traefik-dashboard.md
Normal file
@@ -0,0 +1,93 @@
|
||||
# 03-01-k3s Traefik Dashboard
|
||||
|
||||
> 启用并访问 Traefik Dashboard,用于查看路由与服务状态。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- Traefik 已正常运行
|
||||
- 已了解 Dashboard 仅用于运维,不建议公网裸露
|
||||
|
||||
## 部署说明(几个 Pod?配置存哪?如何同步?)
|
||||
|
||||
- **几个 Pod**:K3s 默认 Traefik 是 **Deployment、replicas=1**,即只有 **1 个 Traefik Pod**。该 Pod 可能跑在控制节点或你打了 Traefik 入口标签的任意节点上。
|
||||
- **为何每个节点都能访问**:流量不是“每个节点一个 Traefik”。每个节点上的 80 端口由 K3s 的 **ServiceLB(svclb-traefik)** 监听,请求被转发到 **同一个** Traefik Service,再转到上述那 1 个 Traefik Pod。所以多节点能访问是 LB 转发到同一后端,不是每节点一个 Traefik。
|
||||
- **配置存在哪里**:**HelmChartConfig** 与 **IngressRoute** 是 Kubernetes 资源,存在 **etcd**(控制节点)。Traefik 进程通过 **Kubernetes API** 监听 Ingress/IngressRoute 等,动态生成路由,**不需要在多个 Pod 之间同步配置**。若以后把 Traefik 扩成多副本,所有副本都从同一 API 读到的资源,行为一致。
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 在控制节点创建 `traefik-dashboard.yaml`,放入 K3s manifests 目录(K3s 启动时自动加载,重启后无需手动 apply):
|
||||
|
||||
- **默认路径**:`/var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml`
|
||||
- **自定义 data-dir**(如 `--data-dir=/storage`):`<data-dir>/server/manifests/traefik-dashboard.yaml`
|
||||
|
||||
   **Single source of truth (do not duplicate it inline in the docs)**: [full HelmChartConfig + IngressRoute YAML](../ansible/files/traefik-dashboard/traefik-dashboard.yaml). Copy it to the manifests path above, or run from the repository root:
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/traefik-dashboard/traefik-dashboard.yaml
|
||||
```
|
||||
|
||||
2. 应用配置并等待 Traefik 重载(按实际路径选择其一复制执行):
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
kubectl apply -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
kubectl apply -f /storage/server/manifests/traefik-dashboard.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
3. 验证:一键对全部节点 IP 做 curl 测试(按实际环境修改 IP 列表):
|
||||
|
||||
```bash
|
||||
# 已按 01-02 / 01-07 配置 K3s 默认 LB(Traefik 入口标签 + firewalld 基线),61~64 任一台 :80 均应返回 200/307
|
||||
for ip in 192.168.2.61 192.168.2.62 192.168.2.63 192.168.2.64; do
|
||||
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 3 "http://${ip}/dashboard/" 2>/dev/null) || code="---"
|
||||
echo "${ip}: ${code}"
|
||||
done
|
||||
```
|
||||
|
||||
查看 Traefik 日志(确认无报错):
|
||||
|
||||
```bash
|
||||
kubectl -n kube-system logs deploy/traefik --tail=50
|
||||
```
|
||||
|
||||
可选:只看响应头(单节点)
|
||||
`curl -I --max-time 5 http://192.168.2.61/dashboard/`
|
||||
|
||||
## 删除部署与文件
|
||||
|
||||
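Because a chart can have only one HelmChartConfig, delete this deployment and remove the manifest file before testing 03-02 (ACME) or 03-03 (Dashboard + ACME combined), so that it is neither overwritten nor loaded twice.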
|
||||
|
||||
1. **删除集群内资源**(HelmChartConfig + IngressRoute):
|
||||
|
||||
```bash
|
||||
kubectl delete -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
```bash
|
||||
kubectl delete -f /storage/server/manifests/traefik-dashboard.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
2. **删除宿主机上的 manifest 文件**(否则 K3s 重启会再次加载):
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
sudo rm -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard.yaml
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
sudo rm -f /storage/server/manifests/traefik-dashboard.yaml
|
||||
```
|
||||
|
||||
## 下一步
|
||||
|
||||
- `04-03-k3s-nginx-demo.md`
|
||||
- `04-01-k3s-nodejs-高级部署.md`
|
||||
384
docs/03-02-k3s-traefik-acme.md
Normal file
@@ -0,0 +1,384 @@
|
||||
# 03-02-k3s Traefik ACME
|
||||
|
||||
> **状态:✅ 验证已完成**(2026-03,K3s 4 节点 ylc61~ylc64,Cloudflare DNS-01、Let’s Encrypt 证书、TLS 矩阵 `test01~test04.jackadam.top`,HTTPS 与 HTTP-only 各 16 目标均 200;Ansible `nginx-matrix-tls-deploy.yml` 已实机跑通。)
|
||||
|
||||
> 为 Traefik 配置 ACME 自动证书(Let's Encrypt + Cloudflare DNS 验证),并部署 **TLS 矩阵**。
|
||||
>
|
||||
> **为 02-05 的升级版**:02-05 为 HTTP-only(节点 IP、无域名);本页在其基础上增加 ACME 证书、域名、根路径 `/`,用于有域名时的学习或生产。
|
||||
|
||||
---
|
||||
|
||||
## 前置条件
|
||||
|
||||
- Traefik 已可用(见 `01-02-k3s-工作节点.md`)
|
||||
- 域名托管在 Cloudflare
|
||||
- 已获取 Cloudflare API Token(最小权限)
|
||||
- **调度与标签要求同 02-05 矩阵**(M1~M4 仍是控制/工作 × Ingress/IngressRoute 四类):M1 需控制节点带 `node-role.kubernetes.io/control-plane` 标签且 Deployment 含 toleration,否则 M1 Pod 会 Pending。若未打标,排障思路与 `02-05-nginx-验证矩阵-一键部署.md` 中 M1 Pending 相同;**本页访问域名为 HTTPS/HTTP 根路径,不是 `/demo-mx`**。
|
||||
|
||||
---
|
||||
|
||||
## 部署说明(Pod、部署方式、配置与存储)
|
||||
|
||||
- **Pod / 部署**:ACME 配置通过 `HelmChartConfig` 注入到 **同一个 Traefik Deployment**。**副本数为 chart 默认值 1**(即 `deployment.replicas` 未在 values 里写时默认为 1),所以只有 1 个 Traefik Pod;与 03-01 的 Traefik 是同一套 Deployment,只是 values 里多了 ACME 参数与 env。
|
||||
- **配置存在哪里**:`HelmChartConfig` 存在 **etcd**(控制节点);K3s 的 chart 控制器据此更新 Traefik 的部署参数,Traefik 进程从 **Kubernetes API** 读取 Ingress/IngressRoute,无需多 Pod 间同步。
|
||||
- **ACME 存储(证书与账户)**:`acme.storage` 指向容器内 **`/data/acme.json`**。未配 hostPath 时,K3s 默认会为 Traefik 挂载卷到 `/data`(如 emptyDir 或默认持久卷),**仅当前这一个 Traefik Pod 可写**,Pod 重建后若卷不持久则需重新申请证书。若在 values 里配置了 **hostPath**(见本页可选配置),则 `/data` 对应宿主机目录,证书写在物理机路径,便于备份与复用;Traefik 仍为 1 个 Pod,不存在多副本间同步 acme.json 的问题。
|
||||
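- **The first deployment lands on an arbitrary node; what happens after a restart?** Without a nodeSelector, Traefik is scheduled onto whichever node the scheduler picks. With **hostPath**, the certificates exist only on that node's disk: when the **Pod moves to another node** (restart, eviction, scale down then up), the same hostPath on the new node is a different disk, so **the certificates do not follow** and may have to be requested again. To keep certificates across restarts or node failures, choose one of:
  - **Pin Traefik to one node**: set `nodeSelector` in the HelmChartConfig `valuesContent`, e.g. `nodeSelector: { kubernetes.io/hostname: ylc61 }`, so the hostPath always resolves on the same machine. Node names are the short hostnames ylc61~ylc64, which also suits the Cloudflare CDN setup. Check your chart's `values.yaml` for the exact position of the key; in the upstream Traefik chart it is a top-level value rather than a child of `deployment`.
  - **Skip hostPath** and rely on the K3s default persistent volume (with local-path the volume is still bound to one node; a Pod rebuilt on that node can reuse it).
  - **Use shared storage such as NFS** mounted at `/data`, so every node reads the same certificates (you must configure the PVC/volume in values yourself).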
|
||||
|
||||
---
|
||||
|
||||
## 创建 Secret
|
||||
|
||||
```bash
|
||||
# 首次创建(不存在时)
|
||||
kubectl -n kube-system create secret generic cloudflare-api-token \
|
||||
--from-literal=api-token='<YOUR_CLOUDFLARE_API_TOKEN>'
|
||||
|
||||
# 若需更新 Token,可先删除再重建(不会影响已签发的证书,只影响后续申请/续期)
|
||||
kubectl -n kube-system delete secret cloudflare-api-token --ignore-not-found=true
|
||||
kubectl -n kube-system create secret generic cloudflare-api-token \
|
||||
--from-literal=api-token='<YOUR_NEW_CLOUDFLARE_API_TOKEN>'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Token 与 Secret 验证(建议强制做)
|
||||
|
||||
> 目的:确认 Cloudflare API Token 对 `jackadam.top`(你的 zone)是可用且“可查 zone / 可编辑 DNS”的,避免 ACME 只报失败但根因不明。
|
||||
|
||||
### 1)本机验证 Token 是否能查到 zone
|
||||
|
||||
在任意联网环境的终端运行(Token 不要手工写死到文档里,下面用占位符):
|
||||
|
||||
```bash
|
||||
TOKEN='<YOUR_CLOUDFLARE_API_TOKEN>'
|
||||
|
||||
curl -sS -X GET "https://api.cloudflare.com/client/v4/zones?name=jackadam.top" \
|
||||
-H "Authorization: Bearer $TOKEN" \
|
||||
-H "Content-Type: application/json" | jq '.success,.errors'
|
||||
```
|
||||
|
||||
期望输出:
|
||||
|
||||
- `.success` 为 `true`
|
||||
- `.errors` 为空数组 `[]`
|
||||
|
||||
如果 `.success=false`,通常会看到类似 `9109: Invalid access token` 或 `403`,这意味着 Token 依然不满足 Cloudflare API 的权限要求。
|
||||
|
||||
> 如果你的机器没有 `jq`,可以先把 `| jq '.success,.errors'` 去掉看原始 JSON。
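When `jq` is unavailable, the success flag can still be checked mechanically. This is a rough sketch: `cf_response_ok` is a hypothetical helper that greps the raw JSON, which is good enough for a quick yes/no but is not a real JSON parser.

```bash
#!/usr/bin/env bash
# Hypothetical fallback when jq is unavailable: test the "success" field with grep.
# Reads the Cloudflare API JSON response on stdin; exit 0 only if "success": true is present.
cf_response_ok() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Usage (TOKEN set as in the command above):
#   curl -sS "https://api.cloudflare.com/client/v4/zones?name=jackadam.top" \
#     -H "Authorization: Bearer $TOKEN" | cf_response_ok && echo "token accepted"
```

A `true` result with an empty `result` array may still mean the token is valid but cannot see the zone, so glance at the raw JSON once as well.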
|
||||
|
||||
### 2)集群内验证 Secret 是否写入了该 Token
|
||||
|
||||
用“哈希对比”的方式校验(只输出哈希,不会直接泄露 Token 原文):
|
||||
|
||||
```bash
|
||||
# 本机 Token 的 sha256
|
||||
TOKEN='<YOUR_CLOUDFLARE_API_TOKEN>'
|
||||
printf "%s" "$TOKEN" | sha256sum
|
||||
|
||||
# 集群 Secret 解码后的 sha256(仅输出哈希)
|
||||
kubectl -n kube-system get secret cloudflare-api-token \
|
||||
-o jsonpath='{.data.api-token}' | base64 -d | sha256sum
|
||||
```
|
||||
|
||||
若两行 `sha256sum` 结果一致,说明集群里的 Secret 已经写入为你当前测试的 Token。
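The comparison can be reduced to a single match/mismatch line. A minimal sketch, assuming GNU `sha256sum` is available; `sha256_of_stdin` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print only the hex digest of whatever arrives on stdin.
sha256_of_stdin() {
  # sha256sum prints "<digest>  -"; keep the first field only.
  sha256sum | cut -d' ' -f1
}

# Usage:
#   local_hash=$(printf '%s' "$TOKEN" | sha256_of_stdin)
#   secret_hash=$(kubectl -n kube-system get secret cloudflare-api-token \
#     -o jsonpath='{.data.api-token}' | base64 -d | sha256_of_stdin)
#   [ "$local_hash" = "$secret_hash" ] && echo "match" || echo "MISMATCH"
```

A mismatch is often caused by a trailing newline, for example when the Secret was created from `echo` output or from a file. `printf '%s'` avoids that on the local side.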
|
||||
|
||||
---
|
||||
|
||||
## 配置 HelmChartConfig
|
||||
|
||||
> **重要**:同一 chart 只能有一份 `HelmChartConfig`(如 `name: traefik`)。若已按 03-01 部署了 Dashboard,再单独 apply 本文件的配置会**覆盖**掉 03-01,Dashboard 会失效。此时应二选一:**要么**使用 `03-03-k3s-traefik-dashboard-acme.md` 中的合并 YAML(Dashboard + ACME 一份搞定),**要么**把本页的 ACME 配置合并进已有 03-01 的 `traefik-dashboard.yaml`,只保留一个 manifest 文件。
|
||||
>
|
||||
> **文件选择**:K3s 自带的 `traefik.yaml` 会被 K3s 覆盖,**不要修改**。所有自定义配置(ACME、nodeSelector、hostPath 以及其他扩展配置)都应写在 **`traefik-acme.yaml`** 这一份 HelmChartConfig 里,与默认 chart 合并生效。
|
||||
|
||||
1. 在控制节点创建 `traefik-acme.yaml`,推荐放入 K3s manifests 目录(路径同 03-01)。**完整配置见 `ansible/files/traefik-acme/traefik-acme.yaml`**(与 Ansible 共用),复制后替换 `<YOUR_REAL_EMAIL>` 等占位符即可。
|
||||
|
||||
> 将 `<YOUR_REAL_EMAIL>` 改为你的邮箱。`/data/acme.json` 为容器内路径;`caserver` 为测试服务器(staging),正式上线前删除该行即切回生产 CA。Traefik 在容器内监听 8000/8443,由 Service 和 svclb 映射到节点 80/443。
|
||||
>
|
||||
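> **Ping / PROXY protocol**: `--ping.entryPoint=websecure` makes `GET /ping` return 200 on port 443 (HTTPS), which suits production setups that expose only TLS and let HAProxy probe 443 (`option httpchk` + `http-check send meth GET uri /ping`, with `check-ssl verify none` on each `server` line so that the probe itself speaks TLS). Traefik's ping can be bound to only one entrypoint at a time; if HAProxy probes port 80 over plain HTTP (intranet/testing), use `--ping.entryPoint=web` instead. `trustedIPs` must include the IP that HAProxy connects from; if HAProxy runs on the upstream router (e.g. 192.168.2.1), add `send-proxy-v2` to the HAProxy backend `server` lines, otherwise Traefik has no PROXY header to parse.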
|
||||
>
|
||||
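> **Where are the external node ports (80/443) set?** They come from the Service defined by K3s's bundled Traefik chart, which maps 80→8000 and 443→8443 by default, so usually nothing needs changing. To change the external port (for example 8080 instead of 80 on the node), add a `ports` override in `valuesContent`, e.g. `ports.web.exposedPort: 8080` and `ports.websecure.exposedPort: 8443`. These differ from `ports.web.port` / `ports.websecure.port`: the former is the port exposed by the node/Service, the latter the port inside the container. Key names follow the Traefik Helm chart's `values.yaml`; confirm them against the chart version your K3s ships.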
|
||||
|
||||
|
||||
2. 应用配置并等待 Traefik 重载(按实际路径选择其一复制执行):
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
kubectl apply -f /var/lib/rancher/k3s/server/manifests/traefik-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
kubectl -n kube-system logs deploy/traefik --tail=100 | grep -i acme || true
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
kubectl apply -f /storage/server/manifests/traefik-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
kubectl -n kube-system logs deploy/traefik --tail=100 | grep -i acme || true
|
||||
```
|
||||
|
||||
3. (可选)检查容器内 `acme.json` 是否生成:
|
||||
|
||||
```bash
|
||||
kubectl -n kube-system exec -it deploy/traefik -- sh -c 'ls -l /data/acme.json || true'
|
||||
```
|
||||
|
||||
> **关于持久化与备份(GitLab 等重状态服务)**:请统一使用持久化存储方案(例如 Longhorn)并制定备份策略。见 `03-07-k3s-longhorn-持久化存储.md`。
|
||||
|
||||
---
|
||||
|
||||
## 验证与使用顺序概览
|
||||
|
||||
> 1)按上节完成 **ACME 配置**(Secret + `traefik-acme.yaml`)。
|
||||
> 2)apply **本页下方的完整矩阵 YAML**(M1~M4:TLS 仅 `websecure` + 同域名 **HTTP-only** 仅 `web` 供内网/测试,与 02-05 的 `/demo-mx` 清单不同),Traefik 会自动发现并签发证书。
|
||||
> 矩阵的域名为 `test01.jackadam.top`~`test04.jackadam.top`(M1~M4 各一);M2/M4 的 hostname 按实际修改。
|
||||
> **可选**:也可使用 Ansible 一键部署,见下方「Ansible 一键部署(TLS 矩阵)」。
|
||||
|
||||
---
|
||||
|
||||
## 常见问题:ACME 报 DNS 解析错误的处理
|
||||
|
||||
> 现象:Traefik 日志中反复出现
|
||||
>
|
||||
> - `lookup acme-staging-v02.api.letsencrypt.org on 10.43.0.10:53: server misbehaving`
|
||||
> - 或 CoreDNS 日志中出现 `dial udp [240e:...]:53: connect: network is unreachable`
|
||||
>
|
||||
> 原因:宿主机启用了 IPv6 DNS,但 k3s 默认 flannel 网络只提供 IPv4 Pod 网络,CoreDNS 在 Pod 内访问 IPv6 上游 DNS 失败,导致 ACME 无法解析 Let’s Encrypt 域名。
|
||||
|
||||
**简单可靠的修法(保持集群 IPv4-only,仅修上游 DNS):**
|
||||
|
||||
1. 编辑 CoreDNS 的 ConfigMap,将上游 DNS 改为明确的 IPv4 地址:
|
||||
|
||||
```bash
|
||||
kubectl -n kube-system edit configmap coredns
|
||||
```
|
||||
|
||||
将 Corefile 中的
|
||||
|
||||
```txt
|
||||
forward . /etc/resolv.conf
|
||||
```
|
||||
|
||||
改为类似(按实际环境选择可用的 IPv4 DNS):
|
||||
|
||||
```txt
|
||||
forward . 114.114.114.114 8.8.8.8
|
||||
```
|
||||
|
||||
2. 重启 CoreDNS:
|
||||
|
||||
```bash
|
||||
kubectl -n kube-system rollout restart deploy/coredns
|
||||
kubectl -n kube-system rollout status deploy/coredns
|
||||
```
|
||||
|
||||
3. 在 Traefik Pod 内验证解析是否恢复正常:
|
||||
|
||||
```bash
|
||||
POD=$(kubectl -n kube-system get pod -l app.kubernetes.io/name=traefik -o jsonpath='{.items[0].metadata.name}')
|
||||
kubectl -n kube-system exec -it "$POD" -- nslookup acme-staging-v02.api.letsencrypt.org || \
|
||||
kubectl -n kube-system exec -it "$POD" -- nslookup acme-v02.api.letsencrypt.org
|
||||
```
|
||||
|
||||
解析成功后,重新访问 `https://test01.jackadam.top`~`https://test04.jackadam.top`,Traefik 会重新尝试通过 ACME 申请证书,`openssl s_client` 输出中的 `issuer` 将不再是 `TRAEFIK DEFAULT CERT`。
|
||||
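To preview the Corefile change before touching the live ConfigMap, the forwarder rewrite from step 1 can be sketched as a small shell filter. `rewrite_forward` is a hypothetical helper name, and the upstream addresses are only the examples used above. K3s may also re-apply its bundled CoreDNS manifest on restart, so a manual edit is worth re-checking later.

```bash
# rewrite_forward UPSTREAMS
# Reads a Corefile on stdin and replaces the /etc/resolv.conf forwarder
# with the given space-separated IPv4 upstreams (hypothetical helper).
rewrite_forward() {
  sed "s|forward \. /etc/resolv\.conf|forward . $1|"
}

# Preview only (needs cluster access); nothing is written back:
# kubectl -n kube-system get configmap coredns -o jsonpath='{.data.Corefile}' \
#   | rewrite_forward "114.114.114.114 8.8.8.8"
```

Lines that do not contain the resolv.conf forwarder pass through unchanged, so the filter can be run on the whole Corefile.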
|
||||
> 若需要在 k3s 上完整打通 IPv6 / dual-stack(包括 Pod 级 IPv6 出网),通常需要使用支持 IPv6 的 CNI(如 Calico、Cilium)并重新设计网络,建议参考单独的 Calico/Cilium 双栈实验文档。
|
||||
|
||||
## TLS 矩阵清单(02-05 升级版)
|
||||
|
||||
> **唯一真源**:[`ansible/files/nginx-matrix-tls/`](../../ansible/files/nginx-matrix-tls/)(`01-control-ingress.yaml`~`04-worker-ingressroute.yaml`),与 [`ansible/playbooks/nginx-matrix-tls-deploy.yml`](../../ansible/playbooks/nginx-matrix-tls-deploy.yml) 共用;**本文不再内联整份 YAML**。
|
||||
|
||||
**相对 02-05 的差异摘要**:基于域名根路径 `/`;TLS 仅绑 `websecure`;含 HTTP-only(仅 `web`)路由;与 02-05 的 `/demo-mx` 为两套资源;M2/M4 节点名与域名请在清单内编辑。
|
||||
|
||||
**清单目录中每文件**已同时包含 TLS 与 HTTP-only:每个 Mx 块内先为 TLS 路由(仅 `websecure`),紧跟同域名的 HTTP-only 路由(仅 `web`,内网/测试用)。**HTTP 也 200** 指 `http://test01~test04.jackadam.top` 直接返回 200(不跳转 HTTPS、不 404),由上述 4 个 HTTP-only 资源实现:M1/M3 为 Ingress(`traefik.ingress.kubernetes.io/router.entrypoints: web`,无 `spec.tls`、无 certresolver),M2/M4 为 IngressRoute(`entryPoints: [web]`,无 `tls` 段),与 TLS 路由共用同一批 Deployment/Service。**生产仅暴露 TLS 时**可删除这些 HTTP-only 资源。
|
||||
|
||||
---
|
||||
|
||||
部署有两种方式,任选其一即可。
|
||||
|
||||
**方式一:使用仓库 YAML 目录(推荐与文档一致)**
|
||||
|
||||
1. 在仓库中编辑 [`ansible/files/nginx-matrix-tls/`](../../ansible/files/nginx-matrix-tls/) 内各文件(M2/M4 节点名、域名等)。
|
||||
2. 按 k3s 存储方案可将整个目录复制到控制节点 manifests,或直接在仓库根执行 `kubectl apply -f ansible/files/nginx-matrix-tls/ -R`(与 `01-01-k3s-控制节点含traefik.md` 存储路径说明一致)。
|
||||
|
||||
3. 清理示例(路径与 apply 时一致):
|
||||
```bash
|
||||
kubectl delete -f ansible/files/nginx-matrix-tls/ -R --ignore-not-found=true
|
||||
```
|
||||
Alternatively, delete by resource name (independent of the path):
|
||||
```bash
|
||||
kubectl delete deployment,svc -n default nginx-m1 nginx-m2 nginx-m3 nginx-m4 --ignore-not-found=true
|
||||
kubectl delete ingress -n default nginx-m1 nginx-m3 nginx-m1-http nginx-m3-http --ignore-not-found=true
|
||||
kubectl delete ingressroute -n default nginx-m2 nginx-m4 nginx-m2-http nginx-m4-http --ignore-not-found=true
|
||||
kubectl delete configmap -n default nginx-m1-html nginx-m2-html nginx-m3-html nginx-m4-html --ignore-not-found=true
|
||||
```
|
||||
|
||||
**方式二:使用仓库内 manifests + Ansible playbook**
|
||||
|
||||
- 直接使用仓库中已合并好的 4 个文件(每个 Mx 含 TLS + HTTP-only),在**仓库根目录**执行:
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nginx-matrix-tls/ -R
|
||||
```
|
||||
需保证当前环境已设置 KUBECONFIG 或 `kubectl` 已指向目标集群(例如在控制节点上或已配置远程 kubeconfig)。
|
||||
- 一键部署/清理推荐用 Playbook(会先删 02-05 残留、再 apply、并做就绪与 curl 验证):
|
||||
- 在 `ansible/` 目录下:`ansible-playbook -i inventory.ini playbooks/nginx-matrix-tls-deploy.yml`
|
||||
- 在仓库根目录:`ansible-playbook -i ansible/inventory.ini ansible/playbooks/nginx-matrix-tls-deploy.yml`
|
||||
- 清理:同上命令后加 `-e mode=cleanup`。
|
||||
|
||||
验证 HTTP 与 HTTPS 是否正常(将 `192.168.2.61 … 192.168.2.64` 按实际入口节点 IP 修改):
|
||||
|
||||
```bash
|
||||
# HTTP 验证(4 节点 × 4 域名,应均为 200)
|
||||
for ip in 192.168.2.61 192.168.2.62 192.168.2.63 192.168.2.64; do
|
||||
echo "=== 节点 $ip 上的 4 个域名 (HTTP) ==="
|
||||
for id in 1 2 3 4; do
|
||||
curl -sI "http://test0$id.jackadam.top/" --resolve "test0$id.jackadam.top:80:$ip" -o /dev/null -w "test0$id:%{http_code}\n" || echo "test0$id:fail"
|
||||
done
|
||||
echo ""
|
||||
done
|
||||
|
||||
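# HTTPS check (4 nodes × 4 domains, all should return 200; -k skips certificate validation, so check the issuer separately to confirm it is Let's Encrypt)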
|
||||
for ip in 192.168.2.61 192.168.2.62 192.168.2.63 192.168.2.64; do
|
||||
echo "=== 节点 $ip 上的 4 个域名 (HTTPS) ==="
|
||||
for id in 1 2 3 4; do
|
||||
curl -skI "https://test0$id.jackadam.top/" --resolve "test0$id.jackadam.top:443:$ip" -o /dev/null -w "test0$id:%{http_code}\n" || echo "test0$id:fail"
|
||||
done
|
||||
echo ""
|
||||
done
|
||||
```
|
||||
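With 16 results per loop, it can help to print only the entries that are not 200. The filter below is a minimal sketch: `failed_entries` is a hypothetical helper name, and it assumes the `name:code` format produced by the `-w` option in the loops above.

```bash
# failed_entries
# Reads "name:code" lines (the -w output of the loops above) on stdin,
# prints every entry whose code is not 200, and returns non-zero if any exist.
# Lines without a colon (section headers, blank lines) are ignored.
failed_entries() {
  awk -F: '/:/ && $2 != "200" { print; bad = 1 } END { exit bad }'
}

# Example with captured output instead of a live cluster:
# printf 'test01:200\ntest02:404\n' | failed_entries   # prints test02:404
```

Piping either loop into `failed_entries` leaves an empty result when every node and domain answers 200.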
|
||||
---
|
||||
|
||||
|
||||
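If ACME and Cloudflare are configured correctly, the Traefik log shows the certificates being issued; all four domains return 200, and the pages show **M1**, **M2**, **M3**, and **M4** respectively, which makes it easy to tell which backend answered.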
|
||||
|
||||
**若 curl 报 “SSL certificate problem: self-signed certificate”**:说明 Traefik 未拿到 Let's Encrypt 证书,在用默认自签证书。按下面逐项排查:
|
||||
|
||||
1. **确认已部署 TLS 矩阵**(Ingress 需带 `spec.tls`、`host`、注解 `certresolver=cloudflare`):
|
||||
```bash
|
||||
kubectl get ingress -n default nginx-m1 -o yaml | grep -A5 "tls:\|host:\|certresolver"
|
||||
```
|
||||
若无 `tls` / `host` / `certresolver`,说明当前是 02-05 的非 TLS Ingress,需执行 `kubectl apply -f ansible/files/nginx-matrix-tls/ -R`(或跑 Ansible playbook `nginx-matrix-tls-deploy.yml`)。
|
||||
|
||||
2. **看 Traefik 是否尝试/成功申请证书**:
|
||||
```bash
|
||||
kubectl -n kube-system logs deploy/traefik --tail=500 | grep -iE "acme|Let's Encrypt|certificate|error"
|
||||
```
|
||||
若有 `obtained certificate` 则已签发;若有 `acme: error` 或 `failed to get certificate`,记下错误信息(常见:DNS 挑战超时、Cloudflare API Token 权限不足、域名未在 Cloudflare 托管)。
|
||||
|
||||
3. **确认 Cloudflare**:test01~test04.jackadam.top 的 zone 在 Cloudflare;Secret `cloudflare-api-token` 存在且 API Token 有 “Zone:DNS:Edit”;若用 staging CA 测试,通过后再删 `caserver` 行切回生产。
|
||||
|
||||
4. **本地验证时加 `-k`** 可忽略证书校验只看 HTTP 是否 200:`curl -sk -o /dev/null -w "%{http_code}" https://test01.jackadam.top/ --resolve test01.jackadam.top:443:192.168.2.61`。证书正常前可用 `-k` 做功能验证。
|
||||
|
||||
5. **日志出现 “Router uses a nonexistent certificate resolver certificateResolver=cloudflare”**:表示当前 Traefik **未加载 ACME 配置**,没有名为 `cloudflare` 的证书解析器,因此会用自签证书。需确认本页「配置 HelmChartConfig」中的 `traefik-acme.yaml` 已放到 K3s manifests 目录并已 apply,且 Traefik 已用新配置重载:
|
||||
```bash
|
||||
kubectl get helmchartconfig -n kube-system traefik -o yaml
|
||||
# 若存在且 valuesContent 中含 certificatesresolvers.cloudflare,则重启 Traefik 使配置生效
|
||||
kubectl -n kube-system rollout restart deploy/traefik
|
||||
kubectl -n kube-system logs deploy/traefik --tail=50 | grep -i cloudflare
|
||||
```
|
||||
若 HelmChartConfig 不存在或没有 cloudflare 段,请按本页「创建 Secret」与「配置 HelmChartConfig」重新创建并 apply。
|
||||
|
||||
6. **日志出现 “service not found” / “kubernetes service not found: default/nginx-m2” / “middleware … does not exist”**:说明 Ingress/IngressRoute 已存在,但对应的 **Service 或 Middleware 缺失**(例如只 apply 了部分 TLS 矩阵,或先删后 apply 时 Traefik 在中间时刻读到不完整状态)。需**完整** apply TLS 矩阵,保证 M1~M4 的 Deployment、Service、Middleware、Ingress/IngressRoute 一起就绪:
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nginx-matrix-tls/ -R
|
||||
kubectl get svc,middleware -n default | grep -E "nginx-m|stripprefix"
|
||||
```
|
||||
确认 nginx-m1~m4 的 Service 与 stripprefix-m1~m4 的 Middleware 均存在后,Traefik 会重新同步路由;证书仍需按上一步确保 ACME 配置生效。
|
||||
|
||||
7. **日志出现 “Unable to obtain ACME certificate” 且错误含 “lookup … server misbehaving” 或 “cannot get ACME client get directory”**:说明 **集群内 DNS(CoreDNS)无法解析 Let's Encrypt 的域名**(如 acme-staging-v02.api.letsencrypt.org),Traefik Pod 连不上 ACME 服务器。排查步骤:
|
||||
- 在**节点上**测解析(用节点自身 DNS):`nslookup acme-staging-v02.api.letsencrypt.org` 或 `getent hosts acme-staging-v02.api.letsencrypt.org`。若节点也解析不了,需检查节点出口 DNS 与网络(如 /etc/resolv.conf、防火墙、是否可访问公网)。
|
||||
- 在**集群内**测解析:`kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup acme-staging-v02.api.letsencrypt.org`。若失败,多为 CoreDNS 的上游 DNS 不可达或不允许外网查询;可临时将 CoreDNS 的 forward 改为 8.8.8.8 / 1.1.1.1 等可访问外网的 DNS 后重试。
|
||||
- 若为内网环境且节点无法直接访问外网,需通过 HTTP 代理或允许集群访问外网 DNS/443 后再申请证书。
|
||||
|
||||
8. **排查时开启 Traefik DEBUG 日志**:默认 INFO 下 ACME 只打少量摘要。若要看到 DNS 挑战、TXT 查询、传播等待等更细的日志,可在 `traefik-acme.yaml` 的 `additionalArguments` 最前面增加一行 `- "--log.level=DEBUG"`,然后 apply 并重启 Traefik:
|
||||
```bash
|
||||
# 在 valuesContent 的 additionalArguments 下第一项插入(若尚未存在)
|
||||
sed -i '/additionalArguments:/a\ - "--log.level=DEBUG"' /storage/server/manifests/traefik-acme.yaml
|
||||
# 若已有 additionalArguments 但无 log.level,可手动在 YAML 里加一行:- "--log.level=DEBUG"
|
||||
kubectl -n kube-system apply -f /storage/server/manifests/traefik-acme.yaml
|
||||
kubectl -n kube-system rollout restart deploy/traefik
|
||||
```
|
||||
复现问题后查看详细输出(ACME、DNS、TXT、propagation 等):
|
||||
```bash
|
||||
kubectl -n kube-system logs deploy/traefik --since=15m --all-containers | grep -iE "acme|dnschallenge|TXT|propagation|challenge|authorization|certificate"
|
||||
```
|
||||
排查结束后可将 `--log.level=DEBUG` 删掉或改为 `INFO`,避免生产日志过多。
|
||||
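Several of the checks above come down to one question: is Traefik still serving its default self-signed certificate? A minimal sketch of that check follows. `is_default_cert` is a hypothetical helper name, and the IP and domain in the usage comment are the examples from this page.

```bash
# is_default_cert ISSUER_LINE
# Succeeds when the issuer string names Traefik's built-in default certificate.
is_default_cert() {
  case "$1" in
    *"TRAEFIK DEFAULT CERT"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage against a live entry node (needs network access):
# issuer=$(openssl s_client -connect 192.168.2.61:443 -servername test01.jackadam.top \
#   </dev/null 2>/dev/null | openssl x509 -noout -issuer)
# if is_default_cert "$issuer"; then echo "still the default certificate"; fi
```

Reading the issuer directly avoids relying on `curl -k`, which reports 200 no matter which certificate is served.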
|
||||
---
|
||||
|
||||
## Ansible 一键部署(TLS 矩阵)
|
||||
|
||||
可使用 Ansible 自动部署 / 清理 TLS 矩阵(test01~test04.jackadam.top)并做 HTTPS 验证:
|
||||
|
||||
- **Playbook**:`ansible/playbooks/nginx-matrix-tls-deploy.yml`
|
||||
- **Manifests**:`ansible/files/nginx-matrix-tls/`(M1~M4 带 TLS,域名为 test01~test04.jackadam.top;按实际修改 M2/M4 节点名 ylc61/ylc64)
|
||||
- **前置**:已按本页完成 ACME 配置,且 test01~test04.jackadam.top 已解析到入口 IP
|
||||
|
||||
```bash
|
||||
# 一键部署 TLS 矩阵
|
||||
cd ansible
|
||||
ansible-playbook -i inventory.ini playbooks/nginx-matrix-tls-deploy.yml -e mode=deploy
|
||||
|
||||
# 一键删除 TLS 矩阵
|
||||
cd ansible
|
||||
ansible-playbook -i inventory.ini playbooks/nginx-matrix-tls-deploy.yml -e mode=cleanup
|
||||
```
|
||||
|
||||
Playbook 在 `mode=deploy` 时会:拷贝 TLS manifests 到控制节点 → **若存在不含 TLS 的 nginx 矩阵(02-05),先按资源名删除**(deployments、svc、ingress、ingressroute、configmaps 共 M1~M4)→ `kubectl apply` TLS 矩阵 → 等待 Pod 就绪 → 对**所有 k3s_nodes 节点**做 HTTPS 验证(4 节点 × 4 域名 = 16 个目标,与 02-05 HTTP 矩阵一致,所有节点均为入口点)。`mode=cleanup` 时则按资源名删除 TLS 矩阵相关 Deployment/Service/Ingress/IngressRoute/ConfigMap,并清理 `/tmp/nginx-matrix-tls` 目录,恢复到未部署 TLS 矩阵前的状态。
|
||||
|
||||
---
|
||||
|
||||
## 删除部署与文件
|
||||
|
||||
因同一 chart 只能有一份 HelmChartConfig,后续改做 03-01(Dashboard)或 03-03(Dashboard+ACME 合并)时,建议先删除本部署并删掉 manifest 文件。
|
||||
|
||||
1. **删除集群内 HelmChartConfig**(Traefik 会按 chart 默认重载,ACME 配置失效):
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
kubectl delete -f /var/lib/rancher/k3s/server/manifests/traefik-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
kubectl delete -f /storage/server/manifests/traefik-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
2. **删除宿主机上的 manifest 文件**(否则 K3s 重启会再次加载):
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
sudo rm -f /var/lib/rancher/k3s/server/manifests/traefik-acme.yaml
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
sudo rm -f /storage/server/manifests/traefik-acme.yaml
|
||||
```
|
||||
|
||||
3. **可选**:nginx 矩阵的删除见 `02-05-nginx-验证矩阵-一键部署.md` 删除小节。Cloudflare API Token 的 Secret(`cloudflare-api-token`)若不再使用可删:`kubectl -n kube-system delete secret cloudflare-api-token`。
|
||||
|
||||
---
|
||||
|
||||
## 注意事项
|
||||
|
||||
- 证书一直不签发:优先检查 DNS 解析与 Cloudflare Token 权限
|
||||
- 首次签发慢:可等待 1-5 分钟再看日志
|
||||
- 仍返回 502:优先回到后端链路排查,不是 ACME 本身问题
|
||||
|
||||
---
|
||||
|
||||
## 相关文档
|
||||
|
||||
- `02-05-nginx-验证矩阵-一键部署.md`(独立 HTTP 矩阵 `/demo-mx`,与本文 TLS 矩阵**非同一套**;调度/标签排障可参考)
|
||||
- `01-02-k3s-工作节点.md`
|
||||
- `03-04-k3s-cloudflare-tunnel-配置接入.md`
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 下一步
|
||||
|
||||
- 返回 00-00-构建总览.md,按导航继续。
|
||||
|
||||
117
docs/03-03-k3s-traefik-dashboard-acme.md
Normal file
@@ -0,0 +1,117 @@
|
||||
# 03-03-k3s Traefik Dashboard + ACME
|
||||
|
||||
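> Work through the Traefik Dashboard and ACME automatic certificates in order, so that later applications (GitLab, Homer, and so on) get HTTPS. **The ACME configuration was verified on real hardware in 03-02** (2026-03, 4-node K3s, Cloudflare DNS-01, Let's Encrypt); this page is the combined Dashboard + ACME version.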
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-02-k3s-工作节点.md`(集群与 Traefik 可用)
|
||||
- 若使用 Cloudflare DNS 验证:域名托管在 Cloudflare,已获取 API Token
|
||||
|
||||
## 1. 创建 Secret(Cloudflare API Token)
|
||||
|
||||
```bash
|
||||
kubectl -n kube-system create secret generic cloudflare-api-token \
|
||||
--from-literal=api-token='<YOUR_CLOUDFLARE_API_TOKEN>'
|
||||
```
|
||||
|
||||
> **Token 与 Secret 验证**:建议先按 `03-02-k3s-traefik-acme.md` 中「Token 与 Secret 验证」一节做 zone 查询与哈希对比,避免 ACME 失败时根因不明。
|
||||
|
||||
## 2. 完整 HelmChartConfig(Dashboard + ACME 合并)
|
||||
|
||||
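> Note: Traefik can have only one `HelmChartConfig`, so Dashboard and ACME must be merged into the same file. **The ACME configuration is based on the real-hardware verification in 03-02** (recursive DNS, propagation wait, ping, PROXY protocol, nodeSelector).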
|
||||
|
||||
创建 `traefik-dashboard-acme.yaml`,推荐放入 K3s manifests 目录(路径同 03-02)。**唯一真源**:[HelmChartConfig 完整 YAML](../../ansible/files/traefik-dashboard-acme/traefik-dashboard-acme.yaml),复制后替换 `<YOUR_REAL_EMAIL>` 等占位符;或在仓库根执行 `kubectl apply -f ansible/files/traefik-dashboard-acme/traefik-dashboard-acme.yaml`。
|
||||
|
||||
> 将 `<YOUR_REAL_EMAIL>` 替换为你的邮箱。正式上线前删除 `caserver` 该行即切回生产 Let's Encrypt。**ACME 排障**(DNS 解析错误、证书解析器不存在等)见 `03-02-k3s-traefik-acme.md` 中「常见问题」与「排查」小节。
|
||||
|
||||
## 3. 部署(按实际路径选择其一复制执行)
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
kubectl apply -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
kubectl -n kube-system logs deploy/traefik --tail=100 | grep -i acme || true
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
kubectl apply -f /storage/server/manifests/traefik-dashboard-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
kubectl -n kube-system logs deploy/traefik --tail=100 | grep -i acme || true
|
||||
```
|
||||
|
||||
## 4. 验证
|
||||
|
||||
```bash
|
||||
# Dashboard:临时端口转发访问
|
||||
kubectl -n kube-system port-forward deploy/traefik 9000:9000
|
||||
# 浏览器打开 http://127.0.0.1:9000/dashboard/
|
||||
|
||||
# ACME 日志
|
||||
kubectl -n kube-system logs deploy/traefik --tail=100 | grep -i acme
|
||||
```
|
||||
|
||||
为 Ingress 使用 `tls` + `certResolver: cloudflare` 后,可通过 `curl -Iv https://你的域名` 检查证书是否由 Let's Encrypt 签发。
|
||||
|
||||
---
|
||||
|
||||
## 5. 使用 Tomcat + test05.jackadam.top 验证 HTTPS
|
||||
|
||||
> 本节给出一个**完整、独立**的 Tomcat 示例:包含 Deployment + Service + Ingress(三段 YAML),域名为 `test05.jackadam.top`。前提是已经按本页前文配置并成功加载了 ACME(`traefik-acme.yaml` 或 `traefik-dashboard-acme.yaml`)。
|
||||
|
||||
1. **唯一真源**:[`ansible/files/traefik-dashboard-acme/tomcat-acme-test05.yaml`](../../ansible/files/traefik-dashboard-acme/tomcat-acme-test05.yaml)。将其中域名改成你实际解析到集群入口 IP 的 FQDN。
|
||||
|
||||
2. 应用并查看 ACME 日志 + 访问验证:
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/traefik-dashboard-acme/tomcat-acme-test05.yaml
|
||||
|
||||
# 查看 ACME 相关日志(证书申请、签发情况)
|
||||
kubectl -n kube-system logs deploy/traefik --tail=200 | grep -i acme || true
|
||||
|
||||
# 使用 --resolve 覆盖 DNS,将域名指向入口 IP 验证 HTTPS
|
||||
curl -Iv https://test05.jackadam.top --resolve test05.jackadam.top:443:192.168.2.61
|
||||
```
|
||||
|
||||
若 ACME 与 Cloudflare 配置正确,Traefik 日志中将看到针对 `test05.jackadam.top` 的证书申请与成功信息;`curl -Iv` 输出中应展示 Let's Encrypt 证书,浏览器访问 `https://test05.jackadam.top` 时会看到 Tomcat 默认首页。
|
||||
|
||||
---
|
||||
|
||||
## 6. 删除部署与文件
|
||||
|
||||
If you later go back to testing 03-01 (Dashboard only) or 03-02 (ACME only) on its own, first delete this deployment and remove the manifest file.
|
||||
|
||||
1. **删除集群内 HelmChartConfig**:
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
kubectl delete -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
kubectl delete -f /storage/server/manifests/traefik-dashboard-acme.yaml
|
||||
kubectl -n kube-system rollout status deploy/traefik
|
||||
```
|
||||
|
||||
2. **删除宿主机上的 manifest 文件**(否则 K3s 重启会再次加载):
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
sudo rm -f /var/lib/rancher/k3s/server/manifests/traefik-dashboard-acme.yaml
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
sudo rm -f /storage/server/manifests/traefik-dashboard-acme.yaml
|
||||
```
|
||||
|
||||
3. **可选**:若不再使用 ACME,可删除 Secret:`kubectl -n kube-system delete secret cloudflare-api-token`。
|
||||
|
||||
## 7. 下一步
|
||||
|
||||
- `03-02-k3s-traefik-acme.md`:仅 ACME 不合并 Dashboard 时,或 TLS 矩阵(test01~test04)验证、排障详情
|
||||
- `03-04-k3s-cloudflare-tunnel-配置接入.md`:若需 Cloudflare Tunnel 接入
|
||||
- `01-08-openwrt-haproxy.md`:如需调整外部端口/防火墙,参考 HAProxy 监听与转发(第 6 节)
|
||||
|
||||
58
docs/03-04-k3s-cloudflare-tunnel-配置接入.md
Normal file
@@ -0,0 +1,58 @@
|
||||
# 03-04-k3s Cloudflare Tunnel integration
|
||||
|
||||
> 本文只讲 K3s 侧如何接入 Cloudflare Tunnel(`cloudflared` 部署、验证、排查)。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-04-cloudflare-tunnel.md`
|
||||
- 已拿到 Tunnel Token 或凭据文件
|
||||
- Traefik 已可用(单节点/多节点均可)
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 在 K3s 中创建保存 token/凭据的 Secret + Deployment。**唯一真源**:[`ansible/files/cloudflare-tunnel/cloudflared.yaml`](../ansible/files/cloudflare-tunnel/cloudflared.yaml)(替换 `TUNNEL_TOKEN` 占位符)。
|
||||
|
||||
2. 部署 `cloudflared` 并确保重启后自动生效(按实际路径选择其一复制执行):
|
||||
|
||||
```bash
|
||||
# 默认路径
|
||||
kubectl apply -f /var/lib/rancher/k3s/server/manifests/cloudflared.yaml
|
||||
kubectl -n kube-system rollout status deploy/cloudflared
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
kubectl apply -f /storage/server/manifests/cloudflared.yaml
|
||||
kubectl -n kube-system rollout status deploy/cloudflared
|
||||
```
|
||||
|
||||
3. 将 `cloudflared.yaml` 放入上述 manifests 目录后,K3s 重启时会自动加载。
|
||||
|
||||
建议要点:
|
||||
|
||||
- 使用官方 `cloudflared` 镜像
|
||||
- Secret 不写死在明文 YAML
|
||||
- `cloudflared` 放在 `kube-system` 或专用 namespace
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
kubectl -n kube-system get pods | grep cloudflared
|
||||
kubectl -n kube-system logs deploy/cloudflared --tail=100
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
- 日志中可见 tunnel connected
|
||||
- 访问域名可到达 Traefik 路由
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 域名解析正常但访问超时:先看 Tunnel 状态与 `cloudflared` 日志
|
||||
- 返回 `404`:通常是 Traefik 路由未命中
|
||||
- 返回 `502`:优先排查后端链路(`06-01-k3s-networkpolicy-故障排查.md`)
|
||||
|
||||
## 下一步
|
||||
|
||||
- `05-03-k3s-安装gitlab-含runner.md`
|
||||
- `05-01-k3s-部署homer首页面板.md`
|
||||
51
docs/03-05-k3s-local-path-pvc.md
Normal file
@@ -0,0 +1,51 @@
|
||||
# 03-05-k3s local-path PVC local persistence
|
||||
|
||||
> K3s 自带的 **local-path-provisioner**:通过 PVC 自动创建本地 PersistentVolume,适用于单副本应用、缓存、日志等,无需 NFS 或 Longhorn。
|
||||
|
||||
## 与 NFS / Longhorn 的区别
|
||||
|
||||
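| Option | Shared | Use cases |
|------|------|----------|
| **local-path** (this page) | No, single node | Single-replica apps (Traefik acme.json, a standalone database, and so on); the Pod stays scheduled on the same node |
| **NFS** (`03-06`) | Yes, multi-node read/write | Directories shared by multiple replicas, cross-node access |
| **Longhorn** (`03-07`) | Block storage, CSI | Heavily stateful systems, snapshots/backups, recommended for production |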
|
||||
|
||||
## 前置条件
|
||||
|
||||
- K3s 已安装(local-path-provisioner 默认启用)
|
||||
- 无额外组件,`kubectl get storageclass` 可见 `local-path`(通常为默认)
|
||||
|
||||
## 操作步骤
|
||||
|
||||
### 1. 清单(PVC + Deployment)
|
||||
|
||||
**唯一真源**:[`ansible/files/local-path-demo/local-path-pvc-demo.yaml`](../ansible/files/local-path-demo/local-path-pvc-demo.yaml)(含 PVC `local-pvc-demo` 与 `nginx-local-pvc-demo` Deployment;`storageClassName` 可省略,K3s 默认多为 `local-path`)。
|
||||
|
||||
### 2. 应用与验证
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/local-path-demo/local-path-pvc-demo.yaml
|
||||
|
||||
kubectl get pv,pvc
|
||||
kubectl get pod -o wide
|
||||
kubectl exec deploy/nginx-local-pvc-demo -- sh -c 'echo hello > /usr/share/nginx/html/test.txt'
|
||||
kubectl delete pod -l app=nginx-local-pvc-demo
|
||||
kubectl exec deploy/nginx-local-pvc-demo -- cat /usr/share/nginx/html/test.txt # 应仍为 hello
|
||||
```
|
||||
|
||||
## 注意事项
|
||||
|
||||
- **绑定到节点**:PV 创建在 Pod 首次调度到的节点上,Pod 重建后仍会调度到该节点(provisioner 会打 nodeAffinity)
|
||||
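- **Single node**: `ReadWriteOnce` means the volume can be mounted read-write by one node only (several Pods on that node can share it); replicas spread across nodes need NFS or Longhorn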
|
||||
- **数据路径**:默认在 K3s `--data-dir` 下的 `storage`,如 `/var/lib/rancher/k3s/storage` 或 `/storage`
|
||||
- **回收策略**:`Delete`,删除 PVC 时 PV 及本地目录会被清理
|
||||
|
||||
## Traefik acme.json 示例
|
||||
|
||||
若希望 Traefik 的 ACME 证书走 local-path PVC,需在 HelmChartConfig 的 values 中为 Traefik 配置 volume 与 volumeMount(见 `03-02-k3s-traefik-acme.md` 可选配置)。多数场景下,配合 `nodeSelector` 固定 Traefik 到同一节点,再用 hostPath 或 local-path 均可;无 hostPath 时 K3s 默认会为 Traefik 挂 emptyDir 或默认卷。
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-06-k3s-使用nfs存储.md`:需多节点共享时
|
||||
- `03-07-k3s-longhorn-持久化存储.md`:重状态、快照、备份
|
||||
|
||||
54
docs/03-06-k3s-使用nfs存储.md
Normal file
@@ -0,0 +1,54 @@
|
||||
# 03-06-k3s Using NFS storage
|
||||
|
||||
> 本文只讲 K3s 集群侧如何使用已安装好的 NFS。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-06-armv7-nfs服务安装.md`
|
||||
- 可从 K3s 节点访问 NFS 服务器与导出目录
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 创建 NFS 类型 `PersistentVolume`
|
||||
2. 创建 `PersistentVolumeClaim`
|
||||
3. 在业务 Pod 中挂载 PVC
|
||||
|
||||
**唯一真源**:[`ansible/files/nfs-demo/nfs-pv-pvc-demo.yaml`](../ansible/files/nfs-demo/nfs-pv-pvc-demo.yaml)(按你的 NFS `server` / `path` 修改)。
|
||||
|
||||
## 验证命令(若 YAML 在 manifests 目录,按实际路径选择其一复制执行)
|
||||
|
||||
```bash
|
||||
# 仓库根直接应用
|
||||
kubectl apply -f ansible/files/nfs-demo/nfs-pv-pvc-demo.yaml
|
||||
```
|
||||
|
||||
```bash
|
||||
# 或默认路径(已拷贝到 manifests 时)
|
||||
kubectl apply -f /var/lib/rancher/k3s/server/manifests/nfs-pv-pvc.yaml
|
||||
kubectl get pv,pvc -A
|
||||
kubectl describe pv nfs-pv-demo
|
||||
```
|
||||
|
||||
```bash
|
||||
# 自定义 data-dir(如 /storage)
|
||||
kubectl apply -f /storage/server/manifests/nfs-pv-pvc.yaml
|
||||
kubectl get pv,pvc -A
|
||||
kubectl describe pv nfs-pv-demo
|
||||
```
|
||||
|
||||
## 预期
|
||||
|
||||
- PV/PVC 状态为 `Bound`
|
||||
- 业务 Pod 可读写挂载目录
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 检查 NFS 服务与导出目录权限
|
||||
- 检查节点到 NFS 服务器网络
|
||||
- 检查 `path` 与 `server` 配置是否正确
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-05-k3s-local-path-pvc.md`:单副本应用用 K3s 自带 local-path 即可,无需 NFS
|
||||
- `05-06-openlist挂载网盘与自动备份.md`
|
||||
|
||||
114
docs/03-07-k3s-longhorn-持久化存储.md
Normal file
@@ -0,0 +1,114 @@
|
||||
# 03-07-k3s Longhorn persistent storage (single-node, personal production)
|
||||
|
||||
> 适用:**没有 NFS**、希望在 K3s 中部署 GitLab 等“重状态”系统,并且能接受“单节点不做高可用、但要可重建/可备份”。
|
||||
|
||||
---
|
||||
|
||||
## 为什么要用 Longhorn(而不是 hostPath / local-path / 容器文件系统)
|
||||
|
||||
- **容器文件系统**:Pod 重建即丢,基本不可用
|
||||
- **hostPath 固定目录**:能落盘,但和调度强绑定,迁移/扩缩容/备份都更麻烦
|
||||
- **local-path PVC** (`03-05`): bundled with K3s and enough for a single replica; no snapshots or backups, and multiple replicas need NFS or Longhorn
|
||||
- **Longhorn(CSI 块存储)**:对 K8s 来说是标准 PVC;即使你只设 **副本数=1**,也能获得:
|
||||
- 统一的 PVC 管理与回收策略
|
||||
- 快照(snapshot)
|
||||
- 备份(backup target,可推到对象存储)
|
||||
|
||||
> 重要:单节点 + 副本=1 **不是高可用**。想要节点级容灾,需要多节点副本或备份到外部介质。
|
||||
|
||||
---
|
||||
|
||||
## 前置条件(CentOS)
|
||||
|
||||
在所有计划作为 Longhorn 存储节点的机器上安装依赖(单节点就只装这一台):
|
||||
|
||||
```bash
|
||||
sudo yum install -y iscsi-initiator-utils nfs-utils
|
||||
sudo systemctl enable --now iscsid
|
||||
```
|
||||
|
||||
准备数据盘目录(你已有专用盘挂载到 `/storage`,建议给 Longhorn 单独目录):
|
||||
|
||||
```bash
|
||||
sudo mkdir -p /storage/longhorn
|
||||
sudo chmod 700 /storage/longhorn
|
||||
```
|
||||
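Before installing Longhorn, a quick preflight can confirm that the client tools from the packages above are on the PATH. This is only a sketch: `missing_cmds` is a hypothetical helper name, and it assumes `iscsiadm` (from `iscsi-initiator-utils`) and `mount.nfs` (from `nfs-utils`) are the binaries worth checking.

```bash
# missing_cmds CMD...
# Prints each command that cannot be found in PATH, one per line (hypothetical helper).
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# On a storage node, an empty result suggests the client tools are installed:
# missing_cmds iscsiadm mount.nfs
```

It does not check that `iscsid` is running; `systemctl is-active iscsid` covers that separately.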
|
||||
---
|
||||
|
||||
## 安装 Longhorn
|
||||
|
||||
```bash
|
||||
kubectl apply -f https://raw.githubusercontent.com/longhorn/longhorn/v1.7.2/deploy/longhorn.yaml
|
||||
kubectl -n longhorn-system rollout status deploy/longhorn-ui
|
||||
```
|
||||
|
||||
将 Longhorn 默认数据路径改到 `/storage/longhorn`:
|
||||
|
||||
```bash
|
||||
kubectl -n longhorn-system patch settings.longhorn.io default-data-path \
|
||||
--type=merge -p '{"value":"/storage/longhorn"}'
|
||||
```
|
||||
|
||||
将 `longhorn` 设为默认 StorageClass(推荐):
|
||||
|
||||
```bash
|
||||
kubectl get storageclass
|
||||
kubectl patch storageclass longhorn -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 单节点“非 HA”建议配置
|
||||
|
||||
### 副本数
|
||||
|
||||
- 建议将 Longhorn 的 **默认副本数**设为 1(节省空间,也符合“非 HA”定位)
|
||||
- 需要迁移卷/临时容灾时,可手动把某个卷副本数调到 2,待同步完成再调回 1
|
||||
|
||||
### 只让“有大盘”的节点承载数据
|
||||
|
||||
如果你是多节点集群但只有少数节点有 `/storage` 大盘:
|
||||
|
||||
- 只把这些节点加入 Longhorn 可调度存储(Longhorn UI 中将其它节点的 disk 设为不可调度)
|
||||
- 或者给存储节点打标签,配合工作负载的 nodeSelector/affinity(让应用尽量靠近数据)
|
||||
|
||||
> 注意:副本=1 时,卷不会“随使用自动从小盘迁到大盘”,需要你手动迁移或从源头限制调度。
|
||||
|
||||
---
|
||||
|
||||
## GitLab 这类重状态系统如何落地
|
||||
|
||||
原则:**所有关键组件都用 PVC**(Longhorn)。
|
||||
|
||||
- **必须 PVC**:PostgreSQL、Redis、Gitaly(repo)、uploads/artifacts/packages、registry(如启用)
|
||||
- **备份**:
|
||||
- 应用层(GitLab 自带 backup)+ 存储层(Longhorn snapshot/backup)双保险
|
||||
|
||||
---
|
||||
|
||||
## Traefik `acme.json` 如何持久化/备份(可选)
|
||||
|
||||
Traefik 的 ACME 状态很小,但也建议持久化以避免重建后触发频繁签发。
|
||||
|
||||
- **推荐**:给 Traefik 的 `/data` 使用 PVC(Longhorn),让 `acme.json` 走标准持久化
|
||||
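- **Fallback**: export `acme.json` regularly as a backup (for example `kubectl cp kube-system/<traefik-pod>:/data/acme.json ...`; `kubectl cp` takes the form `<namespace>/<pod>:<path>`)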
|
||||
|
||||
---
|
||||
|
||||
## 验证
|
||||
|
||||
```bash
|
||||
kubectl -n longhorn-system get pod
|
||||
kubectl get pvc -A
|
||||
kubectl get pv
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 下一步
|
||||
|
||||
- `03-05-k3s-local-path-pvc.md`:单副本、无快照需求时,用 K3s 自带 local-path 即可
|
||||
- 返回 `03-09-k3s-gitops-集群配置管理.md` 或进入业务部署(如 GitLab)章节
|
||||
|
||||
|
||||
81
docs/03-08-k3s-ha-集群配置与切换.md
Normal file
@@ -0,0 +1,81 @@
|
||||
# 03-08-k3s HA cluster configuration and switchover
|
||||
|
||||
> 本文只讲双控制节点 HA 的集群配置与切换步骤。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-05-双控制节点ha.md` 安装准备
|
||||
- 外部 datastore 与 `6443` LB 已可用
|
||||
- 已确认可执行变更窗口
|
||||
|
||||
## 操作步骤
|
||||
|
||||
1. 在首个 server 配置外部 datastore 参数
|
||||
2. 第二个 server 使用一致参数加入
|
||||
3. 将 worker 与 kubeconfig 的 API 地址切换到 LB 地址
|
||||
4. 校验所有节点与核心组件健康
|
||||
|
||||
一个简化的两节点 server 启动示例(仅用于帮助理解参数含义):
|
||||
|
||||
```bash
|
||||
# server1(例如 192.168.2.61)
|
||||
sudo k3s server \
|
||||
--datastore-endpoint="postgres://k3s:strong-password@192.168.2.50:5432/k3s?sslmode=disable" \
|
||||
--tls-san 192.168.2.60
|
||||
|
||||
# server2(例如 192.168.2.63),使用相同 datastore 与 token:
|
||||
sudo k3s server \
|
||||
--server https://192.168.2.60:6443 \
|
||||
--token <SAME_TOKEN> \
|
||||
--datastore-endpoint="postgres://k3s:strong-password@192.168.2.50:5432/k3s?sslmode=disable" \
|
||||
--tls-san 192.168.2.60
|
||||
```
|
||||
|
||||
> 实际执行时,请优先参考官方 HA 文档与本仓库步骤,将上述命令转化为持久化的 systemd 配置或安装脚本参数。
|
||||
|
||||
### 推荐:将现有 worker 升级为第二控制节点的顺序
|
||||
|
||||
假设你已有一个控制节点(server1,示例 `192.168.2.61`)和一个工作节点(示例 `192.168.2.63`),希望把 `192.168.2.63` 升级为第二控制节点,大致顺序建议如下:
|
||||
|
||||
1. **在首个 server 上完成外部 datastore 与 LB 切换**
|
||||
- 按前文「server1 启动示例」准备好外部 datastore 参数与 `--tls-san`(LB 地址)。
|
||||
- 确保此时集群仍然健康,`kubectl get nodes` 只有一个 `control-plane` 节点为 `Ready`。
|
||||
2. **排空并停止原有 worker**
|
||||
- 可选:使用 `kubectl drain 192.168.2.63 --ignore-daemonsets --delete-emptydir-data` 将工作负载迁走;
|
||||
- 在该节点上停止 `k3s-agent` 服务(或执行 `k3s-agent-uninstall.sh`),避免 agent 与后续 server 角色冲突。
|
||||
3. **在该节点以 server 角色重新加入**
|
||||
- 使用与 server1 相同的 token、datastore、LB 地址,执行 k3s server 安装(命令示例参考上面的 server2 启动片段);
|
||||
- 确保 `--server` 指向 LB 地址(例如 `https://192.168.2.60:6443`),而不是单一节点 IP。
|
||||
4. **重新标记工作负载调度策略**
|
||||
- 根据需要为新 server 添加/调整 `node-role` 或 `taints`,决定是否在控制节点上承载工作负载;
|
||||
- 再次查看 `kubectl get nodes -o wide`,确认两个 server 都为 `Ready`,原 worker 已成功「升级」为控制节点。
|
||||
|
||||
## 验证命令
|
||||
|
||||
```bash
|
||||
kubectl get nodes -o wide
|
||||
kubectl get pods -A
|
||||
```
|
||||
|
||||
进行一次故障演练:停止任意一个 server,确认 API 仍可访问。
|
||||
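During the failure drill, it is easier to judge availability from repeated probes than from a single request. The loop below is a minimal sketch: `count_failures` and `PROBE_INTERVAL` are hypothetical names, and the probe command in the usage comment assumes your kubeconfig already points at the LB address.

```bash
# count_failures N CMD...
# Runs CMD N times, PROBE_INTERVAL seconds apart (default 1), and prints how many runs failed.
# The body runs in a subshell so its variables do not leak (hypothetical helper).
count_failures() (
  n="$1"; shift
  fails=0; i=0
  while [ "$i" -lt "$n" ]; do
    "$@" >/dev/null 2>&1 || fails=$((fails + 1))
    i=$((i + 1))
    sleep "${PROBE_INTERVAL:-1}"
  done
  echo "$fails"
)

# During the drill, with kubeconfig pointing at the LB address:
# count_failures 30 kubectl get --raw /readyz
```

A few failures while the LB notices the stopped server may be expected; a count that keeps growing suggests the second server or the LB backend check needs attention.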
|
||||
## 预期
|
||||
|
||||
- 两个 server 都为 `Ready`
|
||||
- 控制平面故障切换后,集群仍可管理
|
||||
|
||||
## 失败排查
|
||||
|
||||
- 检查 datastore 连通性与账号权限
|
||||
- 检查 LB 后端健康与 6443 转发
|
||||
- 检查两个 server 参数是否一致
|
||||
|
||||
## 参考
|
||||
|
||||
- `01-05-双控制节点ha.md`
|
||||
- `01-01-k3s-控制节点含traefik.md`
|
||||
- `01-02-k3s-工作节点.md`
|
||||
|
||||
## 下一步
|
||||
|
||||
- 返回 00-00-构建总览.md,按导航继续。
|
||||
56
docs/03-09-k3s-gitops-集群配置管理.md
Normal file
@@ -0,0 +1,56 @@
|
||||
# 03-09-k3s-gitops-集群配置管理(框架草案)
|
||||
|
||||
> 本文先给出 GitOps 管理 k3s 集群的大致框架,后续可以按需要再细化成完整实践。
|
||||
> 目标:在 `01-07` 自动装好 k3s 之后,由 GitOps 工具(Argo CD / Flux)自动把 Traefik、监控、应用等 YAML 下发到集群。
|
||||
|
||||
## 1. 选型与边界
|
||||
|
||||
- GitOps 工具二选一:
|
||||
- **Argo CD**:UI 友好、概念清晰,适合个人实验;
|
||||
- **Flux**:更轻量,完全 Git 驱动,命令行为主。
|
||||
- 建议先选其中一个,**不要在同一集群同时跑两套 GitOps**。
|
||||
|
||||
## 2. 仓库结构建议(与本仓库的关系)
|
||||
|
||||
建议将「集群声明性配置」与本仓库代码/文档区分开,形成一个专门的 GitOps 仓库,例如:
|
||||
|
||||
```text
|
||||
homelab-k3s-gitops/
|
||||
clusters/
|
||||
ylc-k3s-01/
|
||||
kustomization.yaml
|
||||
apps/
|
||||
traefik/
|
||||
monitoring/
|
||||
homer/
|
||||
openlist/
|
||||
gitlab/
|
||||
```
|
||||
|
||||
本仓库依然作为「文档 + 脚本 + ansible playbook」的入口,GitOps 仓库只存 K8s 清单。
|
||||
|
||||
## 3. 最小 Argo CD 部署思路(示意)
|
||||
|
||||
> 如果未来你决定使用 Argo CD,可以按以下思路展开(这里不给出完整清单,仅做导航):
|
||||
|
||||
1. 在 k3s 集群中安装 Argo CD(官方 `install.yaml` 或 Helm);
|
||||
2. 暴露 Argo CD Server(通过 Traefik IngressRoute 或 NodePort);
|
||||
3. 在 Argo CD 中创建一个 Application,指向 GitOps 仓库的 `clusters/ylc-k3s-01`;
|
||||
4. 在 `clusters/ylc-k3s-01/kustomization.yaml` 中列出:
|
||||
- Traefik 扩展配置;
|
||||
- Prometheus+Grafana;
|
||||
- Homer、openlist、GitLab 等应用的 Kustomize/Helm 目录。
|
||||
|
||||
## 4. 与现有文档的衔接
|
||||
|
||||
- `01-07-节点初始化-ansible-实践.md`:负责从「可 SSH 裸机」到「k3s 就绪」;
|
||||
- This page (`03-09`): takes the cluster from "k3s ready" to "configuration delivered from Git";
|
||||
- 其他 `02-**`、`04-**`、`05-**` 文档中的部署命令,可以逐步迁移为 GitOps 仓库中的 YAML/Kustomize/Helm 定义。
|
||||
|
||||
## 5. 后续可以补充的内容(TODO)
|
||||
|
||||
- 针对 Argo CD 或 Flux 选定一个具体方案,写出:
|
||||
- 详细安装步骤;
|
||||
- GitOps 仓库的完整示例结构;
|
||||
- 与 Cloudflare Tunnel、监控、openlist 等现有专题的映射关系。
|
||||
|
||||
81
docs/04-01-k3s-nodejs-高级部署.md
Normal file
@@ -0,0 +1,81 @@
|
||||
# 04-01-k3s Node.js 高级部署
|
||||
|
||||
> Node.js 属于 `04` 高级部署序列。
|
||||
> 本文作为 Node.js 主入口:先跑通基础链路,再扩展到自定义端口、存储与构建。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `01-02-k3s-工作节点.md`
|
||||
- 已完成 `01-04-cloudflare-tunnel.md`(如需外网入口)
|
||||
- 已完成 `03-01-k3s-traefik-dashboard.md`(可选,便于观察路由)
|
||||
|
||||
## 基础部署步骤
|
||||
|
||||
1. 创建 Node.js Deployment(默认监听 `3000`)
|
||||
2. 创建 Service(`80 -> 3000`)
|
||||
3. 创建 Ingress 或 IngressRoute(路径建议 `/node`)
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
完整 YAML 不在本文重复粘贴,请以仓库内文件为准(与 Ansible 共用):
|
||||
|
||||
| 项 | 路径 / 命令 |
|
||||
|----|-------------|
|
||||
| 清单文件 | [`ansible/files/nodejs-demo/04-01-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-01-nodejs-demo.yaml) |
|
||||
| 手工应用 | `kubectl apply -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml` |
|
||||
| Ansible | `ansible-playbook -i ansible/inventory.ini ansible/playbooks/nodejs-demo-apply.yml -e nodejs_demo_manifest=04-01-nodejs-demo.yaml` |
|
||||
|
||||
索引与累积说明见 [`ansible/files/nodejs-demo/README.md`](../ansible/files/nodejs-demo/README.md)。
|
||||
|
||||
### 相对上游
|
||||
|
||||
本文为 **基线**,无上游清单;以下为资源摘要(便于检索,**以 YAML 文件为准**)。
|
||||
|
||||
| 资源 | 要点 |
|
||||
|------|------|
|
||||
| Deployment `nodejs-demo` | `node:18-alpine`,监听 **3000**,`replicas: 1` |
|
||||
| Service | `80` → `targetPort: 3000` |
|
||||
| Ingress | `entrypoints: web`,路径 **`/node`**,无 `host` |
|
||||
|
||||
应用方式:
|
||||
|
||||
```bash
|
||||
kubectl apply -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
|
||||
```
|
||||
|
||||
## 基础验证
|
||||
|
||||
```bash
kubectl get pod,svc,ing -n default -o wide
curl -s --max-time 3 http://192.168.2.61/node/
curl -s --max-time 3 http://192.168.2.62/node/
```
|
||||
|
||||
预期:返回应用内容(如 `Hello World from Node.js`)。
|
||||
|
||||
|
||||
## 失败排查
|
||||
|
||||
统一看:`06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 部署阶段扩展(分项导航)
|
||||
|
||||
在本文 `nodejs-demo` 基线上按主题增量实践(建议顺序大致由上到下)。**每篇分项均链接到 `ansible/files/nodejs-demo/` 下累积清单**,并附 **相对上一篇的变更表**;与 [`ansible/playbooks/nodejs-demo-apply.yml`](../ansible/playbooks/nodejs-demo-apply.yml) 共用。
|
||||
|
||||
- `04-02-nodejs-镜像与运行命令.md`:镜像 tag、`imagePullPolicy`、`command`/`args`
|
||||
- `04-03-nodejs-环境变量与配置注入.md`:ConfigMap/Secret、`env`/`envFrom`
|
||||
- `04-04-nodejs-端口与Service.md`:`containerPort` 与 Service/Ingress 端口对应
|
||||
- `04-05-nodejs-资源请求与限制.md`:`resources`、OOM/CPU 节流
|
||||
- `04-06-nodejs-探针与健康检查.md`:存活/就绪/启动探针
|
||||
- `04-07-nodejs-调度与亲和.md`:`nodeSelector`、亲和、容忍
|
||||
- `04-08-nodejs-安全上下文.md`:`securityContext`、非 root、只读根等
|
||||
- `04-09-nodejs-存储与卷.md`:`emptyDir`、PVC、配置卷挂载
|
||||
- `04-10-nodejs-Ingress与Traefik.md`:路径、主机名、`web`/`websecure`
|
||||
- `04-11-nodejs-副本与滚动发布.md`:`replicas`、滚动策略
|
||||
- `04-12-nodejs-TLS与证书.md`:Ingress `tls`、HTTPS(与 `03-02` ACME 配合)
|
||||
- `04-13-nodejs-HPA.md`:水平自动扩缩容
|
||||
- `04-14-nodejs-GitOps与CI流水线.md`:构建镜像、GitOps/CI 闭环
|
||||
|
||||
## 下一步
|
||||
|
||||
- 返回 `00-00-构建总览.md`,按导航继续。
|
||||
67
docs/04-02-nodejs-镜像与运行命令.md
Normal file
@@ -0,0 +1,67 @@
|
||||
# 04-02-nodejs-镜像与运行命令
|
||||
|
||||
> 在 [`04-01-k3s-nodejs-高级部署.md`](04-01-k3s-nodejs-高级部署.md) 的 `nodejs-demo` 基线上,调整**镜像**与**进程启动方式**。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已按 `04-01` 部署并验证 `curl` 可达。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| 项 | 路径 / 命令 |
|
||||
|----|-------------|
|
||||
| 本篇完整清单(累积至 04-02) | [`ansible/files/nodejs-demo/04-02-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-02-nodejs-demo.yaml) |
|
||||
| 手工应用 | `kubectl apply -f ansible/files/nodejs-demo/04-02-nodejs-demo.yaml` |
|
||||
| Ansible | `ansible-playbook -i ansible/inventory.ini ansible/playbooks/nodejs-demo-apply.yml -e nodejs_demo_manifest=04-02-nodejs-demo.yaml` |
|
||||
|
||||
若你更喜欢命令行换镜像,文末也给了 **`kubectl set image`**,可不改仓库清单。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **换镜像版本**:就像本地 `docker pull node:18.20-alpine`,K8s 里改 `image:` 一行即可;写死版本号比总写 `latest` 更容易排查「昨天还能跑今天不行」。
|
||||
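- **When images are pulled (`imagePullPolicy`)**: a node always pulls an image it does not have yet. For any tag other than `latest` the default is `IfNotPresent`, so if CI keeps overwriting the same tag you generally need **`Always`**; otherwise nodes that already cached that tag keep running the old image.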
|
||||
- **改启动命令**:镜像自带的入口不满足时,用 `command` / `args` 告诉 K8s「用哪条命令起 Node」;和 Docker 里覆盖 `ENTRYPOINT`/`CMD` 一个意思。
|
||||
- **NODE_OPTIONS 等**:适合放在环境变量里,见 [`04-03-nodejs-环境变量与配置注入.md`](04-03-nodejs-环境变量与配置注入.md)。
|
||||
|
||||
### 相对 `04-01` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-01`) | 新文(`04-02`) |
|
||||
|------|-----------------|-----------------|
|
||||
| `containers[].image` | `node:18-alpine` | `node:18.20-alpine` |
|
||||
| `containers[].imagePullPolicy` | (默认) | `IfNotPresent` |
|
||||
| `containers[].command` / `args` | 单行 `["node","-e","...Hello World...listen(3000)"]` | `command: ["node"]` + `args` 两段,`res.end('Hello from pinned image')` |
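As a rough illustration of the three rows above, the container section of the Deployment might look like the fragment below. The image, pull policy, and response text come from the table; the exact script and the way it is split across `args` are assumptions.

```yaml
# Fragment of the Deployment's spec.template.spec (illustrative, not a complete manifest)
containers:
  - name: nodejs-demo
    image: node:18.20-alpine        # pinned tag instead of node:18-alpine
    imagePullPolicy: IfNotPresent   # stated explicitly
    command: ["node"]               # replaces the image ENTRYPOINT
    args:                           # replaces the image CMD
      - "-e"
      - "require('http').createServer((req, res) => res.end('Hello from pinned image')).listen(3000)"
    ports:
      - containerPort: 3000
```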
|
||||
|
||||
应用:
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-02-nodejs-demo.yaml
# Or change only the image without editing the manifest (illustrative)
kubectl set image deployment/nodejs-demo nodejs-demo=node:18.20-alpine -n default
```
|
||||
|
||||
## 验证
|
||||
|
||||
```bash
kubectl describe pod -l app=nodejs-demo -n default | sed -n '/Image:/,/Port:/p'
kubectl get pod -l app=nodejs-demo -n default
curl -s --max-time 3 http://<node-IP>/node/
```
|
||||
|
||||
预期:Pod 为 Running;响应体与当前 `command`/`args` 一致。
|
||||
|
||||
## 删除 / 回滚
|
||||
|
||||
```bash
kubectl rollout undo deployment/nodejs-demo -n default
# Or re-apply the 04-01 baseline manifest:
kubectl apply -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **ImagePullBackOff**:镜像名/tag 错误、私有仓库未配置 `imagePullSecrets`、节点无法访问 registry。
|
||||
- 统一网络与策略:`06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-03-nodejs-环境变量与配置注入.md`](04-03-nodejs-环境变量与配置注入.md)
|
||||
- [`04-05-nodejs-资源请求与限制.md`](04-05-nodejs-资源请求与限制.md)
|
||||
65
docs/04-03-nodejs-环境变量与配置注入.md
Normal file
@@ -0,0 +1,65 @@
|
||||
# 04-03-nodejs-环境变量与配置注入
|
||||
|
||||
> 在 [`04-01-k3s-nodejs-高级部署.md`](04-01-k3s-nodejs-高级部署.md) 基线上,用 **ConfigMap / Secret** 与 **`env` / `envFrom`** 注入配置,避免把敏感信息写进镜像或 Deployment 明文。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`)。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| 项 | 路径 / 命令 |
|
||||
|----|-------------|
|
||||
| 本篇完整清单(累积至 04-03,含 ConfigMap + Deployment + Service + Ingress) | [`ansible/files/nodejs-demo/04-03-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-03-nodejs-demo.yaml) |
|
||||
| Secret 示例(勿提交真密钥) | [`ansible/files/nodejs-demo/nodejs-demo-secret.example.yaml`](../ansible/files/nodejs-demo/nodejs-demo-secret.example.yaml) |
|
||||
| 手工应用 | `kubectl apply -f ansible/files/nodejs-demo/04-03-nodejs-demo.yaml` |
|
||||
| Ansible | `ansible-playbook ... -e nodejs_demo_manifest=04-03-nodejs-demo.yaml` |
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
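- **Ordinary configuration** (display text, feature flags, non-secret connection strings): use a **ConfigMap**. After `kubectl apply`, values injected through `env` / `envFrom` are only read when a container starts, so the Pods must be restarted to pick them up (for example `kubectl rollout restart deployment/nodejs-demo -n default`); values mounted as files are refreshed by the kubelet after a delay, unless they are mounted with `subPath`.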
|
||||
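- **Secrets**: use a **Secret**. It is consumed much like a ConfigMap, but its values are only base64-encoded, not encrypted by default, so access permissions and where the manifest is stored need stricter control.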
|
||||
- **在 Node 里怎么用**:和在本机设环境变量一样,例如 `NODE_ENV`、`PORT`、`NODE_OPTIONS`;启动命令怎么写见 [`04-02-nodejs-镜像与运行命令.md`](04-02-nodejs-镜像与运行命令.md)。
|
||||
|
||||
### 相对 `04-02` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-02`) | 新文(`04-03`) |
|
||||
|------|-----------------|-----------------|
|
||||
| 新增资源 | (无) | `ConfigMap` `nodejs-demo-config`,`APP_MSG` |
|
||||
| `containers[].env` | (无) | `APP_MSG` 来自 `configMapKeyRef` |
|
||||
| `containers[].command` | `["node"]` + `args` 单行脚本 | `node` + 多行 `-e` 脚本,读 `process.env.APP_MSG`,仍监听 **3000** |
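The rows above could be expressed roughly as follows. The ConfigMap name and the `APP_MSG` key come from the table; the message value and the script body are assumptions made for illustration.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: nodejs-demo-config
  namespace: default
data:
  APP_MSG: "Hello from ConfigMap"   # assumed value
---
# Fragment only: merge into the Deployment's spec.template.spec, do not apply on its own
containers:
  - name: nodejs-demo
    image: node:18.20-alpine
    env:
      - name: APP_MSG
        valueFrom:
          configMapKeyRef:
            name: nodejs-demo-config
            key: APP_MSG
    command:
      - node
      - "-e"
      - |
        const http = require('http');
        const msg = process.env.APP_MSG || 'APP_MSG not set';
        http.createServer((req, res) => res.end(msg)).listen(3000);
```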
|
||||
|
||||
应用:
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-03-nodejs-demo.yaml
```
|
||||
|
||||
## 验证
|
||||
|
||||
```bash
kubectl get cm nodejs-demo-config -n default -o yaml
kubectl exec deploy/nodejs-demo -n default -- printenv APP_MSG
curl -s --max-time 3 http://<node-IP>/node/
```
|
||||
|
||||
## Secret 示例(仅示意)
|
||||
|
||||
**说明**:示例文件为 [`nodejs-demo-secret.example.yaml`](../ansible/files/nodejs-demo/nodejs-demo-secret.example.yaml);也可 `kubectl create secret generic ...`。在 Pod 中用 `env.valueFrom.secretKeyRef` 引用;验证 `printenv API_TOKEN`(注意日志勿打印密钥)。
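A minimal sketch of the `secretKeyRef` reference mentioned above. The Secret name matches the one used in the delete command in this article; the key name `API_TOKEN` is an assumption taken from the `printenv` check.

```yaml
# Fragment of the Deployment container (illustrative)
env:
  - name: API_TOKEN
    valueFrom:
      secretKeyRef:
        name: nodejs-demo-secret   # Secret must exist in the same namespace
        key: API_TOKEN             # assumed key name inside the Secret
```

If the Secret or the key is missing, the Pod stays in `CreateContainerConfigError` instead of starting.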
|
||||
|
||||
## 删除
|
||||
|
||||
```bash
kubectl delete configmap nodejs-demo-config -n default --ignore-not-found
kubectl delete secret nodejs-demo-secret -n default --ignore-not-found
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **CreateContainerConfigError**:引用的 ConfigMap/Secret 不存在或 key 名错误。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-09-nodejs-存储与卷.md`](04-09-nodejs-存储与卷.md)(文件挂载另一种注入方式)
|
||||
- [`04-12-nodejs-TLS与证书.md`](04-12-nodejs-TLS与证书.md)
|
||||
68
docs/04-04-nodejs-端口与Service.md
Normal file
@@ -0,0 +1,68 @@
|
||||
# 04-04-nodejs-端口与Service
|
||||
|
||||
> 理清 **容器监听端口**、**Service 端口** 与 **Ingress backend 端口** 三者对应关系;在 `04-01` 基线上做最小调整。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`)。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| 项 | 路径 |
|
||||
|----|------|
|
||||
| 本篇完整清单(累积至 04-04) | [`ansible/files/nodejs-demo/04-04-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-04-nodejs-demo.yaml) |
|
||||
| 应用 | `kubectl apply -f ansible/files/nodejs-demo/04-04-nodejs-demo.yaml` |
|
||||
|
||||
自 **04-04** 起,累积清单中应用监听 **8080**(与 `04-01` 文档中的 3000 不同,便于与后续探针、分项对齐)。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
可以把流量想成「访客 → Ingress → Service → 容器端口」:
|
||||
|
||||
| 位置 | 字段 | 04-01 里大概是啥 |
|
||||
|------|------|------------------|
|
||||
| 容器 | `containerPort` | 应用**实际监听**的端口,例:`3000` |
|
||||
| Service | `port` / `targetPort` | 集群内访问 Service 的 **`port`**,转发到 Pod 的 **`targetPort`** |
|
||||
| Ingress | `backend.service.port.number` | 填 Service 的 **`port`**(数字),**不是** `targetPort` 这个名字 |
|
||||
|
||||
**改应用监听端口时**:容器监听、`containerPort`、`targetPort` 要一致;Ingress 只要还指向 Service 的 `port: 80`,通常不用动。
|
||||
|
||||
### 相对 `04-03` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-03`) | 新文(`04-04`) |
|
||||
|------|-----------------|-----------------|
|
||||
| 容器内监听 | `.listen(3000)` | `.listen(8080)` |
|
||||
| `containerPort` | `3000` | `8080` |
|
||||
| Service `targetPort` | `3000` | `8080` |
|
||||
| Ingress `backend.service.port.number` | `80` | `80`(不变,仍指 Service 的 `port`) |
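The table can be read as three fragments that have to agree with each other. This is only a sketch of the relevant fields, not the full manifest.

```yaml
# Deployment container fragment
ports:
  - containerPort: 8080      # must match .listen(8080) in the script
---
# Service spec fragment
ports:
  - port: 80                 # what the Ingress backend refers to
    targetPort: 8080         # must match the port the process listens on
---
# Ingress backend fragment (unchanged from 04-03)
backend:
  service:
    name: nodejs-demo
    port:
      number: 80             # the Service port, not the targetPort
```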
|
||||
|
||||
## 多端口(示意)
|
||||
|
||||
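**How to wire it in YAML**: still edit the cumulative manifest (`04-04-nodejs-demo.yaml`). List additional `containerPort` entries on the container and additional `port` / `targetPort` pairs on the Service (each Service port needs a `name` once there is more than one); the Ingress normally points only at the **one externally exposed** Service port.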
|
||||
|
||||
若同一 Pod 既要对外 HTTP,又要对内调试端口,就属于这种「多端口」场景。
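A multi-port setup might look like the sketch below. The debug port `9229` and the port names are hypothetical and are not part of the cumulative manifest.

```yaml
# Deployment container fragment
ports:
  - name: http
    containerPort: 8080
  - name: debug
    containerPort: 9229     # hypothetical internal debug port
---
# Service spec fragment: every port needs a name once there is more than one
ports:
  - name: http
    port: 80
    targetPort: http        # targetPort may reference the container port by name
  - name: debug
    port: 9229
    targetPort: debug
```

The Ingress keeps pointing at port `80`, so the debug port stays reachable only from inside the cluster.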
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-04-nodejs-demo.yaml
kubectl get svc nodejs-demo -n default -o wide
kubectl get endpoints nodejs-demo -n default
curl -s --max-time 3 http://<node-IP>/node/
```
|
||||
|
||||
预期:`endpoints` 有 Pod IP:targetPort;curl 正常。
|
||||
|
||||
## NodePort(可选)
|
||||
|
||||
`Service` 设 `type: NodePort` 可在节点上暴露高位端口调试;生产通常仍走 Ingress。K3s ServiceLB 行为见工作节点文档。
|
||||
|
||||
## 失败排查
|
||||
|
||||
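- **502 / no endpoints**: a 502 while endpoints exist usually means `targetPort` does not match the port the process listens on; an empty endpoints list usually means the Service `selector` does not match the Pod labels, or no Pod is Ready.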
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-10-nodejs-Ingress与Traefik.md`](04-10-nodejs-Ingress与Traefik.md)
|
||||
- [`04-01-k3s-nodejs-高级部署.md`](04-01-k3s-nodejs-高级部署.md)
|
||||
52
docs/04-05-nodejs-资源请求与限制.md
Normal file
@@ -0,0 +1,52 @@
|
||||
# 04-05-nodejs-资源请求与限制
|
||||
|
||||
> 为 `nodejs-demo` 配置 `resources.requests` / `resources.limits`,便于调度与避免单个 Pod 占满节点;为后续 **HPA**([`04-13-nodejs-HPA.md`](04-13-nodejs-HPA.md))提供基础。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`)。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article | [`ansible/files/nodejs-demo/04-05-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-05-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-05-nodejs-demo.yaml` |
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **requests(请求量)**:告诉调度器「我常态大概要多少 CPU/内存」,方便排布;**CPU 类 HPA** 也会用到(见 `04-13`)。
|
||||
- **limits(上限)**:内存超过上限,容器可能被 **OOM 杀掉**;CPU 超过上限会被 **限流**,变慢但不一定重启。
|
||||
- **Node 堆**:还可以用 `NODE_OPTIONS=--max-old-space-size=...` 限制 V8 堆,和容器内存 limit 配合用(见 `04-03`)。
|
||||
|
||||
### 相对 `04-04` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-04`) | 新文(`04-05`) |
|
||||
|------|-----------------|-----------------|
|
||||
| `containers[].resources` | (无) | `requests` cpu 50m / memory 64Mi;`limits` cpu 500m / memory 256Mi |
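Written out as YAML, the row above corresponds to a container fragment like this (values taken from the table; the surrounding structure is omitted):

```yaml
# Fragment of the Deployment container
resources:
  requests:
    cpu: 50m        # what the scheduler reserves; also the base for CPU-utilization HPA
    memory: 64Mi
  limits:
    cpu: 500m       # above this the container is throttled
    memory: 256Mi   # above this the container is OOM-killed
```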
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-05-nodejs-demo.yaml
kubectl describe pod -l app=nodejs-demo -n default | grep -A5 "Limits\|Requests"
```
|
||||
|
||||
若集群已装 **metrics-server**:
|
||||
|
||||
```bash
kubectl top pod -l app=nodejs-demo -n default
```
|
||||
|
||||
## 删除限制
|
||||
|
||||
将 `resources` 整段删除或设为合理值后 `kubectl apply`。
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **OOMKilled**:提高 `limits.memory` 或降低 Node 堆/优化代码;`kubectl describe pod` 看 `Last State`。
|
||||
- **CPU 节流**:延迟升高;调大 `limits.cpu` 或优化热点。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`(与网络无关时仍可先排除入口问题)
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-13-nodejs-HPA.md`](04-13-nodejs-HPA.md)
|
||||
- [`04-06-nodejs-探针与健康检查.md`](04-06-nodejs-探针与健康检查.md)
|
||||
57
docs/04-06-nodejs-探针与健康检查.md
Normal file
@@ -0,0 +1,57 @@
|
||||
# 04-06-nodejs-探针与健康检查
|
||||
|
||||
> 为 `nodejs-demo` 配置 **存活 / 就绪 / 启动** 探针,使 kubelet 能在异常时重启容器,并在未就绪时从 Service **Endpoints** 摘除流量。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`);应用需暴露可探测的 HTTP 路径(示例用根路径 `/`)。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article | [`ansible/files/nodejs-demo/04-06-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-06-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-06-nodejs-demo.yaml` |
|
||||
|
||||
探针端口与累积清单一致,为 **8080**(自 `04-04` 起)。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
Kubernetes 会**周期性访问**你指定的地址,判断容器该不该重启、该不该接流量:
|
||||
|
||||
| 探针 | 人话 |
|
||||
|------|------|
|
||||
| **存活 liveness** | 「还活着吗?」一直失败就**重启容器**(认为卡死)。 |
|
||||
| **就绪 readiness** | 「能接客了吗?」失败时**不放进 Service 负载均衡**(流量不打进来)。 |
|
||||
| **启动 startup** | 「是不是还在慢启动?」启动阶段先由它把关,避免被 liveness **误杀**。 |
|
||||
|
||||
### 相对 `04-05` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-05`) | 新文(`04-06`) |
|
||||
|------|-----------------|-----------------|
|
||||
| `livenessProbe` / `readinessProbe` | (无) | `httpGet` 路径 `/`,端口 **8080**,`initialDelaySeconds`/`periodSeconds` 见清单文件 |
|
||||
|
||||
生产建议为健康检查单独提供 `/health` 或 `/ready` 路径(与业务路由分离)。
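A sketch of what the probe section might look like. The path and port follow the change table; all timing values are assumptions (the real ones are in the manifest file), and the `startupProbe` is shown only as an optional addition.

```yaml
# Fragment of the Deployment container; timing values are assumed
livenessProbe:
  httpGet:
    path: /
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /
    port: 8080
  initialDelaySeconds: 3
  periodSeconds: 5
# Optional: liveness and readiness checks start only after this probe succeeds
startupProbe:
  httpGet:
    path: /
    port: 8080
  periodSeconds: 5
  failureThreshold: 12    # allows up to about 60 s of startup
```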
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-06-nodejs-demo.yaml
kubectl describe pod -l app=nodejs-demo -n default | sed -n '/Liveness/,/Events/p'
kubectl get endpoints nodejs-demo -n default
```
|
||||
|
||||
故意让应用崩溃或阻塞时,观察 **重启次数** 与 **Ready** 条件变化。
|
||||
|
||||
## TCP 探针(备选)
|
||||
|
||||
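**How to wire it in YAML**: use this instead of the HTTP probe. In the **Deployment container**, replace `httpGet` with `tcpSocket: { port: 8080 }` (the cumulative manifest listens on 8080 since `04-04`). It suits services that have no HTTP endpoint, where an open port counts as alive.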
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **CrashLoopBackOff**:`livenessProbe` 过严或 `initialDelaySeconds` 过短;先放宽或加 `startupProbe`。
|
||||
- **Service 无流量但 Pod Running**:readiness 失败;`kubectl get ep` 为空地址。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-04-nodejs-端口与Service.md`](04-04-nodejs-端口与Service.md)
|
||||
- [`04-11-nodejs-副本与滚动发布.md`](04-11-nodejs-副本与滚动发布.md)
|
||||
56
docs/04-07-nodejs-调度与亲和.md
Normal file
@@ -0,0 +1,56 @@
|
||||
# 04-07-nodejs-调度与亲和
|
||||
|
||||
> 控制 `nodejs-demo` **落在哪些节点**:`nodeSelector`、`affinity`、`tolerations`。常用于与 Traefik、存储或合规区域对齐。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`);集群至少一个节点带可区分 **label**(例如 `kubectl get nodes --show-labels`)。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article | [`ansible/files/nodejs-demo/04-07-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-07-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-07-nodejs-demo.yaml` |
|
||||
|
||||
清单中默认 `nodeSelector: kubernetes.io/hostname: ylc62`,请改为本集群节点名。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **想让 Pod 只跑在某几台机器上**:给节点打标签,在 Pod 里写 **`nodeSelector`**,最简单。
|
||||
- **规则更复杂**(尽量分散、尽量和某类 Pod 同机架等):用 **affinity(亲和)**。
|
||||
- **节点有「污点」**:像「专属机器」,Pod 必须配置 **容忍污点(tolerations)** 才能调度上去。
|
||||
|
||||
### 相对 `04-06` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-06`) | 新文(`04-07`) |
|
||||
|------|-----------------|-----------------|
|
||||
| `template.spec.nodeSelector` | (无) | `kubernetes.io/hostname: ylc62`(请按环境修改) |
|
||||
|
||||
仅当节点具备该标签键值时 Pod 才可调度;否则 **Pending**。
|
||||
|
||||
## 亲和性(示意,未写入默认累积清单)
|
||||
|
||||
**说明**:与 `nodeSelector` **不要硬混用冲突条件**;`affinity` / `tolerations` 语法可参考 Kubernetes 文档,在本地改清单或 `kubectl patch` 实验。(示例:`node-role.kubernetes.io/worker` + `Exists`。)
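For the `node-role.kubernetes.io/worker` + `Exists` example mentioned above, a node-affinity fragment might look like this. It is a sketch for local experiments; the label must actually be present on the target nodes (check with `kubectl get nodes --show-labels`).

```yaml
# Fragment of spec.template.spec (illustrative, not in the cumulative manifest)
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node-role.kubernetes.io/worker
              operator: Exists   # any value is accepted, the label only has to exist
```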
|
||||
|
||||
## 容忍污点 tolerations
|
||||
|
||||
若目标节点有 `taints`,需在 Pod 上配置对应 `tolerations`,否则无法调度。
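A toleration has to mirror the taint on the node. The sketch below assumes a hypothetical taint `dedicated=homelab:NoSchedule`; substitute whatever `kubectl describe node <name>` reports under `Taints`.

```yaml
# Fragment of spec.template.spec (illustrative)
tolerations:
  - key: dedicated          # hypothetical taint key
    operator: Equal
    value: homelab          # hypothetical taint value
    effect: NoSchedule
```

A toleration only permits scheduling onto the tainted node; it does not force the Pod there, so it is usually combined with `nodeSelector` or affinity.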
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-07-nodejs-demo.yaml
kubectl get pod -l app=nodejs-demo -n default -o wide
```
|
||||
|
||||
确认 **NODE** 列符合预期。
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **Pending**:`kubectl describe pod` 看 Events(`0/X nodes are available`);检查 selector/affinity/taint。
|
||||
- 与 Traefik DaemonSet 同节点时,注意主机端口与防火墙(见工作节点、OpenWrt HAProxy 文档)。
|
||||
|
||||
## 相关文档
|
||||
|
||||
- `01-02-k3s-工作节点.md`
|
||||
- `02-00-nginx-系列说明.md`(调度思路通用)
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
51
docs/04-08-nodejs-安全上下文.md
Normal file
@@ -0,0 +1,51 @@
|
||||
# 04-08-nodejs-安全上下文
|
||||
|
||||
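> Configure the **Pod- and container-level `securityContext`** for `nodejs-demo`: non-root user, read-only root filesystem, dropped capabilities, and so on. **What the cluster admits (Pod Security Admission or another policy engine; PodSecurityPolicy was removed in Kubernetes 1.25) and what the image supports take precedence**; tighten step by step.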
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`)。
|
||||
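- Note: the official `node` Alpine images run as root by default but ship a `node` user with UID 1000, which is what `runAsUser: 1000` maps to here. Running as non-root requires that writable directories already exist in the image or are mounted as `emptyDir` (see [`04-09-nodejs-存储与卷.md`](04-09-nodejs-存储与卷.md)).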
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article | [`ansible/files/nodejs-demo/04-08-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-08-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-08-nodejs-demo.yaml` |
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **降权**:用非 root 用户跑 Node,减少被攻击后的影响面。
|
||||
- **只读根盘**:系统目录不让写;应用要写临时文件,必须单独挂 **可写卷**(示例用 `/tmp` 的 `emptyDir`)。
|
||||
- **渐进收紧**:先在一个测试命名空间试,再推广;强策略集群可能被准入控制器拦截。
|
||||
|
||||
### 相对 `04-07` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-07`) | 新文(`04-08`) |
|
||||
|------|-----------------|-----------------|
|
||||
| `template.spec.securityContext` | (无) | `fsGroup: 1000` |
|
||||
| `containers[].securityContext` | (无) | `runAsNonRoot` / `runAsUser: 1000` / `readOnlyRootFilesystem: true` 等 |
|
||||
| `volumeMounts` / `volumes` | 仅默认 | `emptyDir` 挂 `/tmp` |
|
||||
|
||||
若应用需写 `node_modules` 等,应改用多阶段构建把依赖打进镜像只读层,或挂卷到可写路径。
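The three rows of the change table could translate into a fragment like the one below. `fsGroup`, `runAsNonRoot`, `runAsUser`, `readOnlyRootFilesystem`, and the `/tmp` mount come from the table; the two fields marked as assumptions are common hardening settings that the table only hints at, and the volume name is made up.

```yaml
# Fragment of spec.template.spec (illustrative)
securityContext:
  fsGroup: 1000
containers:
  - name: nodejs-demo
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false   # assumption, not confirmed by the table
      capabilities:
        drop: ["ALL"]                   # assumption, not confirmed by the table
    volumeMounts:
      - name: tmp                       # hypothetical volume name
        mountPath: /tmp
volumes:
  - name: tmp
    emptyDir: {}
```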
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-08-nodejs-demo.yaml
kubectl get pod -l app=nodejs-demo -n default
kubectl exec deploy/nodejs-demo -n default -- id
```
|
||||
|
||||
预期:Pod Running;`id` 显示非 root(与 `runAsUser` 一致)。
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **permission denied**:写只读路径;增加 `emptyDir`/`PVC` 挂载或放宽 `readOnlyRootFilesystem`。
|
||||
- **镜像必须以非 root UID 可运行**:部分官方镜像入口脚本要求 root,需换镜像或自定义 Dockerfile。
|
||||
- 集群 **Pod Security** / Kyverno 等策略拦截:读策略报错信息调整字段。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-09-nodejs-存储与卷.md`](04-09-nodejs-存储与卷.md)
|
||||
- [`04-05-nodejs-资源请求与限制.md`](04-05-nodejs-资源请求与限制.md)
|
||||
58
docs/04-09-nodejs-存储与卷.md
Normal file
@@ -0,0 +1,58 @@
|
||||
# 04-09-nodejs-存储与卷
|
||||
|
||||
> 为 Node.js 工作负载挂载 **临时卷** 或 **持久卷(PVC)**:日志、上传目录、`/tmp`、只读配置目录等。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`)。
|
||||
- 持久化前请先完成存储类选型:`03-05-k3s-local-path-pvc.md`、`03-06-k3s-使用nfs存储.md`、`03-07-k3s-longhorn-持久化存储.md` 等。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article (includes the PVC and the `/data` mount; default `storageClassName: local-path`) | [`ansible/files/nodejs-demo/04-09-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-09-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-09-nodejs-demo.yaml` |
|
||||
|
||||
emptyDir、仅 ConfigMap 卷等变体可在该清单基础上自行删减 PVC 与 `volumeMounts` 做实验。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **emptyDir**:Pod 删掉数据就没,像临时盘;适合缓存、`/tmp`。
|
||||
- **PVC**:数据由存储驱动落到盘里,Pod 重建还可能挂上同一块盘(取决于存储类型与访问模式)。
|
||||
- **ConfigMap 挂成文件**:适合「配置文件」形态,只读挂载很常见。
|
||||
|
||||
### 相对 `04-08` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-08`) | 新文(`04-09`) |
|
||||
|------|-----------------|-----------------|
|
||||
| 资源列表 | 无 PVC | 新增 `PersistentVolumeClaim` `nodejs-demo-data` |
|
||||
| `volumeMounts` | 仅 `/tmp` | 增加 `/data` |
|
||||
| `volumes` | 仅 `tmp` emptyDir | 增加 `persistentVolumeClaim` |
|
||||
|
||||
**emptyDir 缓存卷**、**ConfigMap 只读挂载** 的片段写法见 Kubernetes 文档;可在 [`04-09-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-09-nodejs-demo.yaml) 上自行合并实验。
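A sketch of the PVC and the extra mount listed in the change table. The claim name, storage class, and mount path come from this article; the access mode, size, and volume names are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nodejs-demo-data
  namespace: default
spec:
  storageClassName: local-path
  accessModes: ["ReadWriteOnce"]   # assumed
  resources:
    requests:
      storage: 1Gi                  # assumed size
---
# Fragment only: merge into the Deployment's spec.template.spec
containers:
  - name: nodejs-demo
    volumeMounts:
      - name: tmp
        mountPath: /tmp
      - name: data                  # hypothetical volume name
        mountPath: /data
volumes:
  - name: tmp
    emptyDir: {}
  - name: data
    persistentVolumeClaim:
      claimName: nodejs-demo-data
```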
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-09-nodejs-demo.yaml
kubectl get pvc -n default
kubectl exec deploy/nodejs-demo -n default -- df -h /data
```
|
||||
|
||||
## 删除
|
||||
|
||||
```bash
kubectl delete pvc nodejs-demo-data -n default
```
|
||||
|
||||
(注意:是否删除底层 PV 数据取决于 reclaim 策略与驱动行为。)
|
||||
|
||||
## 失败排查
|
||||
|
||||
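- **Multi-Attach error**: a `ReadWriteOnce` volume is being attached by Pods on different nodes at the same time; switch to storage that supports `ReadWriteMany`, keep the Pods on one node, or run a single replica.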
|
||||
- **挂载失败**:PVC 未 Bound;StorageClass 不存在。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-03-nodejs-环境变量与配置注入.md`](04-03-nodejs-环境变量与配置注入.md)
|
||||
- [`04-08-nodejs-安全上下文.md`](04-08-nodejs-安全上下文.md)
|
||||
71
docs/04-10-nodejs-Ingress与Traefik.md
Normal file
@@ -0,0 +1,71 @@
|
||||
# 04-10-nodejs-Ingress与Traefik
|
||||
|
||||
> 在 K3s 默认 **Traefik** 下,为 `nodejs-demo` 调整 **路径、主机名、入口点**;并了解标准 `Ingress` 与 **IngressRoute**(CRD)的差异入口。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `04-01` 中的 `Ingress`;可选:`03-01-k3s-traefik-dashboard.md` 观察路由。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article (Ingress with `host` and `/api`) | [`ansible/files/nodejs-demo/04-10-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-10-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-10-nodejs-demo.yaml` |
|
||||
|
||||
`host` / `path` 可按环境修改清单;`curl` 用 IP 访问时需带 **`Host`** 头。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **Ingress**:告诉 Traefik「哪个域名、哪条 URL 转到哪个 Service」。
|
||||
- **`router.entrypoints: web`**:走集群里 Traefik 的 **HTTP 入口**(名字一般是 `web`)。
|
||||
- **和 HTTPS 的关系**:要上证书、走 443,通常改用 **`websecure`**,见 [`04-12-nodejs-TLS与证书.md`](04-12-nodejs-TLS与证书.md)。
|
||||
|
||||
## 04-01 对照
|
||||
|
||||
- 注解 `traefik.ingress.kubernetes.io/router.entrypoints: web` 将路由绑定到 **HTTP** 入口(常见名 `web`)。
|
||||
- HTTPS 入口通常为 **`websecure`**,与 TLS 配合见 `04-12`。
|
||||
|
||||
### 相对 `04-09` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-09`) | 新文(`04-10`) |
|
||||
|------|-----------------|-----------------|
|
||||
| Ingress `spec.rules` | 仅 `http.paths`,无 `host`,path `/node` | `host: app.example.local`,path **`/api`** |
|
||||
|
||||
**注意**:这与 `04-01` **只有 path、没有 host** 的写法不同;用 IP 访问必须带 **`Host: app.example.local`**。
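The changed Ingress might look roughly like this. The host, path, entrypoint annotation, and Service port come from this series; `pathType: Prefix` is an assumption.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: app.example.local     # requests must carry this Host header
      http:
        paths:
          - path: /api
            pathType: Prefix      # assumed
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
```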
|
||||
|
||||
## pathType 说明
|
||||
|
||||
- `Prefix`:前缀匹配(常用)。
|
||||
- `ImplementationSpecific`:由控制器解释;Traefik 有特定行为时需查官方文档。
|
||||
|
||||
## IngressRoute(CRD)
|
||||
|
||||
Traefik 原生 CRD 可做中间件、多规则组合等;集群需已安装对应 CRD。与标准 `Ingress` 二选一或并存时注意不要 **重复暴露同一路径** 导致冲突。
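For comparison, an equivalent rule expressed as an IngressRoute could look like the sketch below. It is not part of the cumulative manifest, and the API group depends on the Traefik version bundled with your K3s release, so treat the `apiVersion` as an assumption to verify with `kubectl api-resources | grep -i ingressroute`.

```yaml
apiVersion: traefik.io/v1alpha1   # older Traefik v2 releases use traefik.containo.us/v1alpha1
kind: IngressRoute
metadata:
  name: nodejs-demo
  namespace: default
spec:
  entryPoints:
    - web
  routes:
    - match: Host(`app.example.local`) && PathPrefix(`/api`)
      kind: Rule
      services:
        - name: nodejs-demo
          port: 80
```

Apply either this or the standard `Ingress` for a given host and path, not both.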
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-10-nodejs-demo.yaml
kubectl describe ing nodejs-demo -n default

# --- Case A: still the 04-01 Ingress (no rules.host, path=/node) ---
# Access the node IP directly; a Host header is normally not needed:
curl -s -o /dev/null -w "%{http_code}\n" --max-time 3 http://<node-IP>/node/

# --- Case B: switched to the example above (host=app.example.local, path=/api) ---
# When accessing by IP the Host header is required, and the path becomes /api (matching the rule):
curl -s -o /dev/null -w "%{http_code}\n" --max-time 3 \
  -H "Host: app.example.local" \
  "http://<node-IP>/api/"
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **404**:路径/host 与规则不一致;Traefik 未加载该 Ingress(namespace、ingressClass)。
|
||||
- **502**:Service 无 Endpoints(见 `04-04`、`04-06`)。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
- 集群级 Traefik:`03-01`、`03-02`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-12-nodejs-TLS与证书.md`](04-12-nodejs-TLS与证书.md)
|
||||
- [`04-04-nodejs-端口与Service.md`](04-04-nodejs-端口与Service.md)
|
||||
70
docs/04-11-nodejs-副本与滚动发布.md
Normal file
@@ -0,0 +1,70 @@
|
||||
# 04-11-nodejs-副本与滚动发布
|
||||
|
||||
> 调整 `nodejs-demo` 的 **副本数** 与 **滚动更新策略**,实现多实例与可控发布。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已部署 `nodejs-demo`(`04-01`)。
|
||||
- 多副本时应用须 **无状态** 或可共享会话;否则需粘性会话/外部会话存储(本文不展开)。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article | [`ansible/files/nodejs-demo/04-11-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-11-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-11-nodejs-demo.yaml` |
|
||||
|
||||
`replicas` 与 `strategy` 在 **Deployment.spec** 下,与 `selector` / `template` 同级。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **多副本**:同样应用跑多份,一台挂了别的还能接客;配合 Service 做负载均衡。
|
||||
- **滚动发布**:换新版本时**一个一个 Pod 换**,而不是全停再起(可通过 `maxSurge` / `maxUnavailable` 调「多激进」)。
|
||||
|
||||
### 相对 `04-10` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-10`) | 新文(`04-11`) |
|
||||
|------|-----------------|-----------------|
|
||||
| `spec.replicas` | `1` | `3` |
|
||||
| `spec.strategy` | (默认 RollingUpdate) | 显式 `RollingUpdate`,`maxSurge: 1`,`maxUnavailable: 0` |
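In YAML, the two rows above sit directly under the Deployment's `spec`. A sketch of just those fields:

```yaml
# Fragment of the Deployment spec (selector and template omitted)
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1         # at most one extra Pod above the desired count during a rollout
      maxUnavailable: 0   # never drop below the desired count of available Pods
```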
|
||||
|
||||
验证:
|
||||
|
||||
```bash
kubectl get deploy nodejs-demo -n default
kubectl get pod -l app=nodejs-demo -n default -o wide
```
|
||||
|
||||
- **`maxUnavailable: 0`**:发布时先起新 Pod 再摘旧 Pod,适合要求 **不中断** 的场景(需足够资源 surge)。
|
||||
- 资源紧张时可适当允许 `maxUnavailable: 1`。
|
||||
|
||||
## 发布新版本
|
||||
|
||||
```bash
kubectl set image deployment/nodejs-demo nodejs-demo=node:20-alpine -n default
kubectl rollout status deployment/nodejs-demo -n default
```
|
||||
|
||||
## 回滚
|
||||
|
||||
```bash
kubectl rollout undo deployment/nodejs-demo -n default
kubectl rollout history deployment/nodejs-demo -n default
```
|
||||
|
||||
## 验证
|
||||
|
||||
```bash
curl -s --max-time 3 -H "Host: app.example.local" "http://<node-IP>/api/"
```
|
||||
|
||||
多次请求应看到多 Pod 分担(若 Service 为 ClusterIP + Ingress,由 kube-proxy/Traefik 负载)。
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **一直滚动中**:新 Pod 未 Ready(探针、镜像拉取);`kubectl describe deploy` / `kubectl get rs`。
|
||||
- **会话漂移**:多副本下登录态不一致为应用架构问题。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-06-nodejs-探针与健康检查.md`](04-06-nodejs-探针与健康检查.md)
|
||||
- [`04-13-nodejs-HPA.md`](04-13-nodejs-HPA.md)
|
||||
63
docs/04-12-nodejs-TLS与证书.md
Normal file
@@ -0,0 +1,63 @@
|
||||
# 04-12-nodejs-TLS与证书
|
||||
|
||||
> 为 `nodejs-demo` 的 **Ingress** 启用 **HTTPS**:`spec.tls` + 证书 **Secret**。集群侧 Traefik **ACME 自动证书** 以 [`03-02-k3s-traefik-acme.md`](03-02-k3s-traefik-acme.md) 为主;本篇侧重 **应用 Ingress 如何声明 TLS** 与验证。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 已完成 `03-02`(推荐):Traefik 已配置 `websecure` 与证书解析器;或你已手动/其他方式准备好 TLS Secret。
|
||||
- 已能 **从客户端访问** 到 Traefik 的 443(或你环境中的 HTTPS 入口)。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article (Ingress switched to **websecure** plus `spec.tls`; does **not** contain the Secret) | [`ansible/files/nodejs-demo/04-12-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-12-nodejs-demo.yaml) |
| Apply | Create the TLS Secret first (see below), then `kubectl apply -f ansible/files/nodejs-demo/04-12-nodejs-demo.yaml` |
|
||||
|
||||
**证书 Secret**:使用命令创建(不提交私钥到 Git):
|
||||
|
||||
```bash
kubectl create secret tls nodejs-demo-tls \
  --cert=path/to/fullchain.pem \
  --key=path/to/privkey.pem \
  -n default
```
|
||||
|
||||
`spec.tls.hosts` / `rules.host` 须与证书 SAN 一致(清单默认为 **app.example.local**)。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **集群怎么收 HTTPS**:多半由 Traefik 终结 TLS(`03-02`);应用侧要在 Ingress 上声明「用哪个证书、哪个域名」。
|
||||
- **Secret 里有什么**:通常是 **`tls.crt`(完整链)** + **`tls.key`(私钥)**,类型为 `kubernetes.io/tls`。
|
||||
|
||||
### 相对 `04-11` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-11`) | 新文(`04-12`) |
|
||||
|------|-----------------|-----------------|
|
||||
| Ingress 注解 `router.entrypoints` | `web` | `websecure` |
|
||||
| Ingress `spec.tls` | (无) | `hosts: [app.example.local]`,`secretName: nodejs-demo-tls` |
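Put together, the TLS-enabled Ingress might look like this sketch. The annotation value, host, and Secret name come from the table; the path and backend are carried over from `04-10`, and `pathType: Prefix` is an assumption.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nodejs-demo
  namespace: default
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: websecure
spec:
  tls:
    - hosts:
        - app.example.local
      secretName: nodejs-demo-tls   # Secret of type kubernetes.io/tls in the same namespace
  rules:
    - host: app.example.local
      http:
        paths:
          - path: /api
            pathType: Prefix        # assumed
            backend:
              service:
                name: nodejs-demo
                port:
                  number: 80
```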
|
||||
|
||||
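With **cert-manager**, the Secret named in `secretName` is created and renewed by the controller. Traefik's built-in ACME resolver does not create a Kubernetes Secret; in that case the certificate is selected through the Ingress annotation `traefik.ingress.kubernetes.io/router.tls.certresolver` rather than through `secretName` (see `03-02`).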
|
||||
|
||||
## 验证
|
||||
|
||||
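```bash
kubectl describe ing nodejs-demo -n default
# If app.example.local does not resolve on the client, pin it to a node IP with --resolve
curl -vk --max-time 5 --resolve app.example.local:443:<node-IP> https://app.example.local/api/
```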
|
||||
|
||||
预期:TLS 握手成功;证书 CN/SAN 与域名匹配(自签或测试环境会有告警属正常)。
|
||||
|
||||
## HTTP 重定向
|
||||
|
||||
可由 Traefik 全局中间件或 Ingress 注解实现 `web` → `websecure`;细节见 `03-02` / Traefik 文档。
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **证书不匹配**:Secret 中证书与 `rules.host` 不一致。
|
||||
- **default backend 404**:`router.entrypoints` 与 Traefik 监听不一致;或仍走 `web` 访问 HTTPS 路径。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
- 集群证书:`03-02-k3s-traefik-acme.md`、`03-03-k3s-traefik-dashboard-acme.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-10-nodejs-Ingress与Traefik.md`](04-10-nodejs-Ingress与Traefik.md)
|
||||
- [`04-01-k3s-nodejs-高级部署.md`](04-01-k3s-nodejs-高级部署.md)
|
||||
53
docs/04-13-nodejs-HPA.md
Normal file
@@ -0,0 +1,53 @@
|
||||
# 04-13-nodejs-HPA
|
||||
|
||||
> 为 `nodejs-demo` 配置 **HorizontalPodAutoscaler**,按 CPU/内存等指标在 `minReplicas`~`maxReplicas` 间自动伸缩。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 集群已安装 **metrics-server**(K3s 常默认启用;`kubectl top nodes` 可用即基本就绪)。
|
||||
- Deployment 已配置 **`resources.requests`**(CPU 指标 HPA 依赖 requests),见 [`04-05-nodejs-资源请求与限制.md`](04-05-nodejs-资源请求与限制.md)。
|
||||
- 建议已配置 **readinessProbe**([`04-06-nodejs-探针与健康检查.md`](04-06-nodejs-探针与健康检查.md)),避免扩容出未就绪 Pod。
|
||||
|
||||
## 清单路径(唯一真源)
|
||||
|
||||
| Item | Path / command |
|------|----------------|
| Full manifest for this article (Deployment/Service/Ingress/PVC/CM plus the **HPA**) | [`ansible/files/nodejs-demo/04-13-nodejs-demo.yaml`](../ansible/files/nodejs-demo/04-13-nodejs-demo.yaml) |
| Apply | `kubectl apply -f ansible/files/nodejs-demo/04-13-nodejs-demo.yaml` (if continuing from `04-12`, the TLS Secret must exist first) |
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **HPA**:根据 CPU/内存等指标,**自动加减 Pod 个数**(在 `minReplicas`~`maxReplicas` 之间)。
|
||||
- **为什么要配 requests**:否则集群算不出「利用率百分比」,CPU 型 HPA 往往**不工作**。
|
||||
|
||||
### 相对 `04-12` 的变更(原文 → 新文)
|
||||
|
||||
| 位置 | 原文(`04-12`) | 新文(`04-13`) |
|
||||
|------|-----------------|-----------------|
|
||||
| 资源 | 无 HPA | 新增 `HorizontalPodAutoscaler` `nodejs-demo`,CPU 目标利用率 50%,`minReplicas: 1`,`maxReplicas: 5` |
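The HPA described in the row above could be written as follows, assuming the `autoscaling/v2` API (available on current Kubernetes and K3s releases):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nodejs-demo
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nodejs-demo
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # percent of the container's CPU request
```

Utilization is measured against `resources.requests`, so with the `50m` CPU request from `04-05` the 50% target corresponds to about 25m of average CPU per Pod.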
|
||||
|
||||
## 部署与验证
|
||||
|
||||
```bash
kubectl apply -f ansible/files/nodejs-demo/04-13-nodejs-demo.yaml
kubectl get hpa -n default
kubectl describe hpa nodejs-demo -n default
```
|
||||
|
||||
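After putting load on the Pods (for example a short load test against the Service from inside the cluster), check whether **REPLICAS** rises and whether it falls back once the load drops. Scale-down is delayed by the stabilization window: the controller flag `--horizontal-pod-autoscaler-downscale-stabilization` (5 minutes by default) or `behavior.scaleDown.stabilizationWindowSeconds` on the HPA.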
|
||||
|
||||
## 删除 HPA
|
||||
|
||||
```bash
kubectl delete hpa nodejs-demo -n default
```
|
||||
|
||||
## 失败排查
|
||||
|
||||
- **unknown / missing metrics**:metrics-server 未运行;Pod 无 `resources.requests`。
|
||||
- **不扩容**:当前利用率未达阈值;或 `maxReplicas` 已达上限。
|
||||
- **频繁抖动**:调高目标利用率或配置 behavior(`scaleDown`/`scaleUp` stabilizationWindow)。
|
||||
- `06-01-k3s-networkpolicy-故障排查.md`
|
||||
|
||||
## 相关文档
|
||||
|
||||
- [`04-11-nodejs-副本与滚动发布.md`](04-11-nodejs-副本与滚动发布.md)
|
||||
- [`05-05-prometheus与grafana.md`](05-05-prometheus与grafana.md)(自定义 metrics 进阶,本文不展开)
|
||||
73
docs/04-14-nodejs-GitOps与CI流水线.md
Normal file
@@ -0,0 +1,73 @@
|
||||
# 04-14-nodejs-GitOps与CI流水线
|
||||
|
||||
> 从 **Node.js 应用仓库** 视角串联:**持续集成(CI)** 构建镜像并推送仓库,**持续交付** 通过 **GitOps** 或流水线步骤把声明式清单下发到 K3s。细节以仓库内 GitLab/GitOps 文档为准,本篇给 **最小闭环与引用**。
|
||||
|
||||
## 前置条件
|
||||
|
||||
- 集群可拉取镜像(私有仓库需 `imagePullSecrets`,见 `04-02` 相关说明)。
|
||||
- 若使用 GitLab:`05-03-k3s-安装gitlab-含runner.md`、`05-04-k3s-配置gitlab-cicd.md`。
|
||||
|
||||
## 清单与仓库(唯一真源)
|
||||
|
||||
- **本文无独立流水线 YAML**(GitLab CI、Argo CD、Flux 随版本变化大);流程见 **`05-04`**、**`03-09`**。
|
||||
- **应用清单真源**:[`ansible/files/nodejs-demo/`](../ansible/files/nodejs-demo/)(例如 `04-01-nodejs-demo.yaml`)。将 **该目录或单文件** 纳入 Git,由 CI 改 `image:` tag 或由 GitOps 同步到集群。
|
||||
|
||||
## 场景说明(白话)
|
||||
|
||||
- **CI(持续集成)**:你 `git push` 之后,自动 **测试、打镜像、推到镜像仓库**。
|
||||
- **CD / GitOps**:要么流水线里 `kubectl apply`,要么让 **Argo CD / Flux** 盯着 Git,**集群状态跟仓库对齐**。
|
||||
- **和 `04-01` 的关系**:`04-01` 是「手写能跑」的最小示例;上流水线后,只是把同样这些 YAML **改成自动化维护**。
|
||||
|
||||
## CI 典型阶段(概念)
|
||||
|
||||
```mermaid
flowchart LR
  A[git push] --> B[lint test build]
  B --> C[docker build]
  C --> D[push registry]
  D --> E[update manifest tag]
  E --> F[deploy to cluster]
```
|
||||
|
||||
1. **源码**:`Dockerfile` 多阶段构建(`node:xx` 构建 → `distroless`/`alpine` 运行)。
|
||||
2. **镜像 tag**:推荐 **commit SHA** 或 **semver**,避免一律 `latest`。
|
||||
3. **推送**:`docker push registry.example.com/app:tag`。
|
||||
4. **更新清单**:修改 Deployment 的 `image:` 或 Kustomize overlay / Helm `values`。
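One way to implement step 4 without editing the Deployment by hand is a Kustomize image override. The sketch below assumes a hypothetical `kustomization.yaml` placed next to the manifest; the registry path mirrors the example in step 3, and the tag is a made-up commit SHA.

```yaml
# kustomization.yaml (hypothetical file next to the manifest)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - 04-01-nodejs-demo.yaml
images:
  - name: node                          # image name as written in the manifest
    newName: registry.example.com/app   # registry path from the push step
    newTag: "3f2a1c9"                   # hypothetical commit SHA written by CI
```

CI then only has to rewrite `newTag`, and the result can be applied with `kubectl apply -k .` or picked up by a GitOps controller.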
|
||||
|
||||
GitLab CI 示例结构与 Runner 注册见 **`05-04`**。
|
||||
|
||||
## Two ways to roll out to the cluster

| Method | Description |
|------|------|
| **kubectl inside the pipeline** | The CI job runs `kubectl apply` using `KUBECONFIG` or an in-cluster ServiceAccount; simple, but it places high demands on credential management. |
| **GitOps** | The repository holds only YAML/Helm; **Argo CD / Flux** watches Git and syncs the cluster automatically; see **`03-09-k3s-gitops-集群配置管理.md`**. |
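## Minimal GitOps path (common to Flux / Argo CD)

1. Keep the manifests in a repository separate from the one that builds the image, or in a different directory of the same repository.
2. Install a GitOps controller in the cluster and point it at a **Git branch + path**.
3. CI is only responsible for **building and pushing the image, then opening a PR that changes the image tag** (or a bot commit); after the merge, the controller **pulls and applies** the change.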
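
The tag-rewriting part of step 3 could look roughly like the following. This is a sketch under assumptions: `set_image_tag` is a hypothetical helper, it assumes the manifest contains a plain `image: <repo>:<tag>` line, and the branch name in the usage comment is a placeholder. Setups based on Kustomize or Helm would normally use `kustomize edit set image` or a `values` change instead.

```bash
# Hypothetical helper: rewrite the tag of one image in a manifest file.
# Usage: set_image_tag <manifest> <image-repo> <new-tag>
set_image_tag() {
  manifest="$1"
  repo="$2"
  tag="$3"
  tmp=$(mktemp)
  # Match "image: <repo>:<old-tag>" and keep everything up to the repo name.
  # Dots in <repo> act as regex wildcards here, which is acceptable for a sketch.
  sed "s|\(image:[[:space:]]*${repo}\):.*|\1:${tag}|" "$manifest" > "$tmp" \
    && cat "$tmp" > "$manifest"
  rm -f "$tmp"
}

# In CI, after the image has been pushed (names below are placeholders):
#   set_image_tag 04-01-nodejs-demo.yaml registry.example.com/app "$SHORT_SHA"
#   git commit -am "chore: bump image to $SHORT_SHA"
#   git push origin HEAD:bump-image
```

Because the change goes through Git, the GitOps controller applies it only after the merge, and the pipeline itself never needs cluster credentials.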
## Relation to 04-01

- The `nodejs-demo.yaml` from `04-01` can serve as the **baseline manifest in the GitOps repository**; CI only replaces fields such as `image:` and `metadata.labels.version`.
## Verification (process level)

- CI: the pipeline succeeds and the image is visible in the registry.
- Cluster: the output of `kubectl get deploy -n <ns> -o jsonpath='{.items[*].spec.template.spec.containers[*].image}'` matches the expected tag.
- Application: `curl` the Ingress path, the same way as in the `04-01` verification.
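
The cluster check can be turned into a pass/fail step. A minimal sketch, assuming a hypothetical helper named `has_image` that reads the space-separated jsonpath output from stdin and succeeds only when the expected image reference is present; the image name in the usage comment is a placeholder.

```bash
# Hypothetical helper: exit 0 if the expected image is among the
# whitespace-separated image references read from stdin.
has_image() {
  expected="$1"
  tr -s ' ' '\n' | grep -Fxq -- "$expected"
}

# Usage (requires cluster access; <ns> is your namespace):
#   kubectl get deploy -n <ns> \
#     -o jsonpath='{.items[*].spec.template.spec.containers[*].image}' \
#     | has_image "registry.example.com/app:01234567" || echo "tag mismatch"
```

Matching the whole reference with `grep -Fx` avoids false positives from tags that merely share a prefix.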
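## Troubleshooting

- **ImagePullBackOff**: wrong tag, unauthenticated registry access, or network restrictions on the node.
- **GitOps not syncing**: wrong branch/path configuration, insufficient RBAC, or CRD conflicts.
- `06-01-k3s-networkpolicy-故障排查.md` (when the service is unreachable after deployment)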
## Related documents

- [`05-03-k3s-安装gitlab-含runner.md`](05-03-k3s-安装gitlab-含runner.md)
- [`05-04-k3s-配置gitlab-cicd.md`](05-04-k3s-配置gitlab-cicd.md)
- [`03-09-k3s-gitops-集群配置管理.md`](03-09-k3s-gitops-集群配置管理.md)
- [`04-02-nodejs-镜像与运行命令.md`](04-02-nodejs-镜像与运行命令.md)
36
docs/05-01-k3s-部署homer首页面板.md
Normal file
@@ -0,0 +1,36 @@
# 05-01-k3s Deploying the Homer Dashboard

> Deploy Homer in K3s as the unified navigation page for the homelab.

---
## Deployment approach

- Homer runs in K3s as an ordinary web application
- It is exposed under a domain name through Traefik (for example `home.example.com`)

---
## Quick deployment

```bash
# Create the namespace, then apply the manifest (path is relative to the repository root)
kubectl create ns homer
kubectl apply -f ansible/files/homer/homer.yaml
```

**Single source of truth**: [`ansible/files/homer/homer.yaml`](../ansible/files/homer/homer.yaml) (Deployment + Service + Ingress; change `host` as needed).

---
## Verification

```bash
# Check the Pod, Service, and Ingress, then send a HEAD request to the node IP
kubectl -n homer get pod,svc,ing -o wide
curl -I --max-time 3 http://192.168.2.61/
```
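
If the Ingress in `homer.yaml` declares a `host`, a request to the bare node IP may not match that rule, so sending a `Host` header is worth trying. The sketch below rests on assumptions: `home.example.com` is the placeholder hostname from the deployment approach, and `http_status` is a hypothetical helper that extracts the status code from `curl -I` output so the check can be scripted. Typical use is shown in the trailing comment.

```bash
# Hypothetical helper: print the HTTP status code from response headers on stdin.
http_status() {
  tr -d '\r' | awk 'NR==1 { print $2 }'
}

# Usage (host name is a placeholder; use the Ingress host you configured):
#   curl -sI --max-time 3 -H 'Host: home.example.com' http://192.168.2.61/ | http_status
```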
---

## Next steps

- `05-02-onenav首页面板.md`
Some files were not shown because too many files have changed in this diff.