feat: reorganize ansible/files and the verification framework by doc_id

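- Restructure ansible/files into directories aligned with the docs' XX-YY IDs, and update the related playbook paths
- Add scripts/verify.sh and ansible/playbooks/verify/*.yml; remove the monolithic verify-matrix.yml
- Update docs: 00-02 matrix status, 00-05 verification framework and workflow, 00-04 environment and ylc65 workstation notes
- Add playbooks and helper scripts for k3s storage preparation, Longhorn, local-path, and related tasks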

Made-with: Cursor
2026-03-26 07:01:14 +08:00
parent a67788de56
commit 8c43761962
192 changed files with 4006 additions and 320 deletions

2
.gitignore vendored
View File

@@ -1,5 +1,7 @@
.cursor
.ssh
# Locally filled-in environment variables for verification orchestration (copied from scripts/.env.verify.example)
scripts/.env.verify
_bmad
_bmad-output
design-artifacts

View File

@@ -11,13 +11,14 @@
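- Deployment environment notes: `docs/00-04-部署环境说明.md` (node layout, IPs, versions, etc.)
- Main script entry point: `scripts/README.md`
- Verification status overview: `docs/00-02-验证矩阵.md`
- Test and verification framework design: `docs/00-05-测试与验证框架.md`
How these entry points divide the work:
- `README.md`: the newcomer entry point; shows what to do and in what order;
- `00-00-构建总览.md`: documentation navigation; shows which document to read next;
- `00-01-k3s-基础概念.md`: concept quick reference; explains unfamiliar K3s/Traefik/NetworkPolicy terms;
- `00-02-验证矩阵.md`: status dashboard; shows which documents have been run successfully in a real environment.
- `00-02-验证矩阵.md`: status dashboard; shows which documents have been run successfully in a real environment (manual verification is currently authoritative).
The directory conventions are simple: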

View File

@@ -0,0 +1,9 @@
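# 00-01-k3s-基础概念 (placeholder)
Corresponding document: [`docs/00-01-k3s-基础概念.md`](../../docs/00-01-k3s-基础概念.md)
## Notes
- This is a conceptual document; it **does not provide deployable Kubernetes manifests**.
- Verification: read the document and compare it against your cluster's actual output (there is no `kubectl apply -f` target).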

View File

@@ -0,0 +1,9 @@
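# 00-04-部署环境说明 (placeholder)
Corresponding document: [`docs/00-04-部署环境说明.md`](../../docs/00-04-部署环境说明.md)
## Notes
- This is an environment description document; it **does not provide deployable Kubernetes manifests**.
- Verification: check your actual environment against the document item by item (nodes, disk mounts, versions, etc.).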

View File

@@ -0,0 +1,13 @@
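# 01-01-k3s-控制节点含traefik (placeholder)
Corresponding document: [`docs/01-01-k3s-控制节点含traefik.md`](../../docs/01-01-k3s-控制节点含traefik.md)
## Notes
- This document mainly covers **K3s installation and cluster initialization**; the core deployment logic lives in the Ansible playbook.
- This directory exists only as a doc_id alignment placeholder; no K8s manifests are maintained here separately.
## Related (for reference)
- Ansible: `ansible/playbooks/k3s-init-and-install.yml`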

View File

@@ -0,0 +1,13 @@
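# 01-02-k3s-工作节点 (placeholder)
Corresponding document: [`docs/01-02-k3s-工作节点.md`](../../docs/01-02-k3s-工作节点.md)
## Notes
- This document mainly covers **joining worker nodes to the K3s cluster** and node-side configuration.
- This directory exists only as a doc_id alignment placeholder; no K8s manifests are maintained here separately.
## Related (for reference)
- Ansible: `ansible/playbooks/k3s-init-and-install.yml`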

View File

@@ -0,0 +1,9 @@
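# 01-03-armv7-standalone-docker (placeholder)
Corresponding document: [`docs/01-03-armv7-standalone-docker.md`](../../docs/01-03-armv7-standalone-docker.md)
## Notes
- This document describes standalone Docker deployment on armv7 devices; it **does not provide K3s/Kubernetes manifests**.
- This directory exists only as a doc_id alignment placeholder.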

View File

@@ -0,0 +1,9 @@
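# 01-04-双控制节点ha (placeholder)
Corresponding document: [`docs/01-04-双控制节点ha.md`](../../docs/01-04-双控制节点ha.md)
## Notes
- This document describes the HA/dual control-plane node design; deployment depends mostly on the cluster architecture and the external LB configuration.
- This directory exists only as a doc_id alignment placeholder; it provides no standalone K8s manifests.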

View File

@@ -0,0 +1,9 @@
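# 01-05-armv7-nfs服务安装 (placeholder)
Corresponding document: [`docs/01-05-armv7-nfs服务安装.md`](../../docs/01-05-armv7-nfs服务安装.md)
## Notes
- This document describes installing the NFS service on armv7 devices; it **does not provide K3s/Kubernetes manifests**.
- This directory exists only as a doc_id alignment placeholder.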

View File

@@ -0,0 +1,13 @@
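# 01-06-节点初始化-ansible-实践 (placeholder)
Corresponding document: [`docs/01-06-节点初始化-ansible-实践.md`](../../docs/01-06-节点初始化-ansible-实践.md)
## Notes
- The source of truth for this document is the Ansible playbooks (initialization, installation, verification).
- This directory exists only as a doc_id alignment placeholder; no K8s manifests are maintained here separately.
## Related (for reference)
- Ansible: `ansible/playbooks/k3s-init-and-install.yml`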

View File

@@ -0,0 +1,12 @@
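# 02-00-nginx-系列说明 (placeholder)
Corresponding document: [`docs/02-00-nginx-系列说明.md`](../../docs/02-00-nginx-系列说明.md)
## Manifest reuse
The deployable manifests for this series (02-01 to 02-04) are consolidated in:
- `ansible/files/02-05-nginx-matrix/`
This directory exists only as a doc_id alignment placeholder.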

View File

@@ -0,0 +1,15 @@
# 02-01-nginx-control-ingress (placeholder)
Corresponding document: [`docs/02-01-nginx-control-ingress.md`](../../docs/02-01-nginx-control-ingress.md)
## Source-of-truth manifests
- Shared manifest directory: `ansible/files/02-05-nginx-matrix/`
- Corresponding file: `01-control-ingress.yaml`
Example:
```bash
kubectl apply -f ansible/files/02-05-nginx-matrix/01-control-ingress.yaml
```

View File

@@ -0,0 +1,15 @@
# 02-02-nginx-control-ingressroute (placeholder)
Corresponding document: [`docs/02-02-nginx-control-ingressroute.md`](../../docs/02-02-nginx-control-ingressroute.md)
## Source-of-truth manifests
- Shared manifest directory: `ansible/files/02-05-nginx-matrix/`
- Corresponding file: `02-control-ingressroute.yaml`
Example:
```bash
kubectl apply -f ansible/files/02-05-nginx-matrix/02-control-ingressroute.yaml
```

View File

@@ -0,0 +1,15 @@
# 02-03-nginx-worker-ingress (placeholder)
Corresponding document: [`docs/02-03-nginx-worker-ingress.md`](../../docs/02-03-nginx-worker-ingress.md)
## Source-of-truth manifests
- Shared manifest directory: `ansible/files/02-05-nginx-matrix/`
- Corresponding file: `03-worker-ingress.yaml`
Example:
```bash
kubectl apply -f ansible/files/02-05-nginx-matrix/03-worker-ingress.yaml
```

View File

@@ -0,0 +1,15 @@
# 02-04-nginx-worker-ingressroute (placeholder)
Corresponding document: [`docs/02-04-nginx-worker-ingressroute.md`](../../docs/02-04-nginx-worker-ingressroute.md)
## Source-of-truth manifests
- Shared manifest directory: `ansible/files/02-05-nginx-matrix/`
- Corresponding file: `04-worker-ingressroute.yaml`
Example:
```bash
kubectl apply -f ansible/files/02-05-nginx-matrix/04-worker-ingressroute.yaml
```

View File

@@ -98,3 +98,4 @@ spec: # Ingress rules
name: nginx-m1 # Service name
port: # Service port
number: 80 # Port number

View File

@@ -92,3 +92,4 @@ spec: # Routing rules
services: # Services to forward to after a match
- name: nginx-m2 # Backend Service name
port: 80 # Backend Service port

View File

@@ -94,3 +94,4 @@ spec: # Ingress rules
name: nginx-m3 # Service name
port: # Backend port
number: 80 # Port number

View File

@@ -92,3 +92,4 @@ spec: # IngressRoute rules
services: # Backend service list
- name: nginx-m4 # Service name
port: 80 # Service port

View File

@@ -10,3 +10,4 @@
| 04-worker-ingressroute.yaml | M4 worker + IngressRoute | /demo-m4 | nodeSelector=ylc64 |
M4 is pinned to ylc64 by default, and M3 runs on a random worker node; adjust to match your environment.

View File

@@ -35,3 +35,4 @@ spec: # Routing rules
services: # Services to forward to after a match
- name: api@internal # Traefik built-in API service
kind: TraefikService # CRD type of this service

View File

@@ -113,3 +113,4 @@ spec: # Ingress rules
name: nginx-m1 # Shared Service
port: # Backend port
number: 80 # Port number

View File

@@ -96,3 +96,4 @@ spec: # Rules
services: # Backend services
- name: nginx-m2 # Backend Service
port: 80 # Port

View File

@@ -108,3 +108,4 @@ spec: # Ingress rules
name: nginx-m3 # Backend Service name
port: # Backend port
number: 80 # Port number

View File

@@ -96,3 +96,4 @@ spec: # Rules
services: # Backend services
- name: nginx-m4 # Backend Service name
port: 80 # Backend port

View File

@@ -38,3 +38,4 @@ spec: # Actual content of the configuration injected into the chart
nodeSelector: # Pin the Traefik Pod to a specific node (safer with RWO local storage)
kubernetes.io/hostname: ylc61 # Hostname of the pinned node (change to match your node)

View File

@@ -92,3 +92,4 @@ spec: # Ingress rules
name: tomcat-test05 # Service name
port: # Service port
number: 8080 # Port number

View File

@@ -0,0 +1,74 @@
# 03-03 Traefik Dashboard + ACME (merged HelmChartConfig)
# Note: a chart can have only one HelmChartConfig (name: traefik), so the Dashboard and ACME settings must be merged.
# Before use: replace <YOUR_REAL_EMAIL>; create the cloudflare-api-token Secret; adjust nodeSelector/trustedIPs/hosts as needed.
---
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        expose: true
      websecure:
        expose: true
      traefik:
        expose: true
    additionalArguments:
      # Dashboard
      - "--api.dashboard=true"
      - "--api.insecure=true"
      # ACME (Cloudflare DNS-01)
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.resolvers=1.1.1.1:53,1.0.0.1:53"
      - "--certificatesresolvers.cloudflare.acme.email=<YOUR_REAL_EMAIL>"
      - "--certificatesresolvers.cloudflare.acme.storage=/data/acme.json"
      # - "--certificatesresolvers.cloudflare.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.provider=cloudflare"
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.propagation.delayBeforeChecks=600"
      # Health check: /ping is served on 443 (for HAProxy's https httpchk)
      - "--ping=true"
      - "--ping.entryPoint=websecure"
      # PROXY protocol (required when HAProxy sits in front)
      - "--entrypoints.web.proxyProtocol.trustedIPs=192.168.2.0/24"
      - "--entrypoints.websecure.proxyProtocol.trustedIPs=192.168.2.0/24"
    env:
      - name: CF_DNS_API_TOKEN
        valueFrom:
          secretKeyRef:
            name: cloudflare-api-token
            key: api-token
    nodeSelector:
      kubernetes.io/hostname: ylc61
    # persistence: persist /data (local-path PVC) so that acme.json is written to disk
    persistence:
      enabled: true
      name: data
      accessMode: ReadWriteOnce
      size: 128Mi
      path: /data
---
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: traefik-dashboard
  namespace: kube-system
spec:
  entryPoints:
    - web
  routes:
    - match: PathPrefix(`/dashboard`) || PathPrefix(`/api`)
      kind: Rule
      services:
        - name: api@internal
          kind: TraefikService

View File

@@ -35,3 +35,4 @@ spec: # Desired state of the Deployment
secretKeyRef: # Read the value from a key in a Secret
name: cloudflared-credentials # Secret name
key: TUNNEL_TOKEN # Key within the Secret

View File

@@ -0,0 +1,9 @@
{
  "nodePathMap": [
    {
      "node": "DEFAULT_PATH_FOR_NON_LISTED_NODES",
      "paths": ["/storage/storage"]
    }
  ]
}

View File

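@@ -39,3 +39,4 @@ spec: # Desired state of the Deployment (replica count, selector, Pod template, etc.)
- name: data # Volume name inside the Pod (used by volumeMounts)
persistentVolumeClaim: # Use a PVC as the volume source
claimName: local-pvc-demo # Which PVC to bind (must match the PVC metadata.name above and be in the same namespace)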

View File

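@@ -10,8 +10,8 @@ spec: # PV spec
- ReadWriteMany # RWX: readable and writable from multiple nodes
persistentVolumeReclaimPolicy: Retain # Reclaim policy: keep the underlying data after the PVC is deleted
nfs: # Storage backend: NFS
server: 192.168.2.22 # NFS server address
path: /data/nfs # NFS export directory
server: <NFS_SERVER_IP> # NFS server address (example: 192.168.2.22; must be replaced before applying)
path: <NFS_EXPORT_PATH> # NFS export directory (example: /sdcard; must be replaced before applying)
---
apiVersion: v1 # Core API version used by the PVC
kind: PersistentVolumeClaim # Resource type: persistent volume claim
@@ -25,3 +25,4 @@ spec: # PVC spec
requests: # Resource requests
storage: 5Gi # Requested capacity
volumeName: nfs-pv-demo # Bind explicitly to the named PV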

View File

@@ -0,0 +1,14 @@
# Longhorn Helm values: lab setup for this repository (four nodes, 10G+32G, /storage mounted separately)
# Chart: https://charts.longhorn.io ; for field descriptions see the official chart values.yaml (tag matching the app version)
# Usage: helm upgrade --install longhorn longhorn/longhorn -n longhorn-system --create-namespace -f values-lab.yaml --version 1.7.2
defaultSettings:
  defaultDataPath: /storage/longhorn
  # String form, consistent with the chart; the lab's 32G data disk favors saving space, so change this to "2" or "3" to practice HA
  defaultReplicaCount: "1"
persistence:
  defaultClass: true
  defaultClassReplicaCount: 1
  defaultFsType: ext4

View File

@@ -0,0 +1,9 @@
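# 03-08-k3s-ha-集群配置与切换 (placeholder)
Corresponding document: [`docs/03-08-k3s-ha-集群配置与切换.md`](../../docs/03-08-k3s-ha-集群配置与切换.md)
## Notes
- This document mostly covers architecture, process, and a walkthrough of configuration options; an actual rollout involves multiple nodes and external components (such as LB/DNS/certificates).
- This directory exists only as a doc_id alignment placeholder; there are no standalone reusable manifests yet.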

View File

@@ -0,0 +1,9 @@
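# 03-09-k3s-gitops-集群配置管理 (placeholder)
Corresponding document: [`docs/03-09-k3s-gitops-集群配置管理.md`](../../docs/03-09-k3s-gitops-集群配置管理.md)
## Notes
- This document is a draft GitOps framework (Argo CD / Flux, etc.); the final manifests depend on the tool selection and version.
- This directory exists only as a doc_id alignment placeholder; there is no fixed manifest yet.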

View File

@@ -52,3 +52,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -56,3 +56,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -73,3 +73,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -73,3 +73,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -80,3 +80,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -92,3 +92,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -94,3 +94,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -107,3 +107,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -125,3 +125,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -126,3 +126,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -131,3 +131,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -138,3 +138,4 @@ spec: # Ingress rules
name: nodejs-demo # Service name
port: # Service port
number: 80 # Port number

View File

@@ -155,3 +155,4 @@ spec: # HPA spec
target: # Target value
type: Utilization # Target type: utilization
averageUtilization: 50 # Target average CPU utilization (%)

View File

@@ -30,7 +30,7 @@
```bash
# From the repository root
kubectl apply -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
kubectl apply -f ansible/files/04-01-nodejs-demo/04-01-nodejs-demo.yaml
```
Alternatively, use Ansible: `ansible/playbooks/nodejs-demo-apply.yml`, with the `nodejs_demo_manifest` variable specifying the file name.
@@ -38,5 +38,6 @@ kubectl apply -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
## dry-run
```bash
kubectl apply --dry-run=client -f ansible/files/nodejs-demo/04-01-nodejs-demo.yaml
kubectl apply --dry-run=client -f ansible/files/04-01-nodejs-demo/04-01-nodejs-demo.yaml
```

View File

@@ -6,3 +6,4 @@ metadata: # Secret metadata
namespace: default # Namespace
stringData: # Plaintext key-value pairs (converted to data on creation)
API_TOKEN: "replace-me" # Example token; replace it, and never commit real secrets

View File

@@ -0,0 +1,13 @@
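# 04-02-nodejs-镜像与运行命令 (placeholder)
Corresponding document: [`docs/04-02-nodejs-镜像与运行命令.md`](../../docs/04-02-nodejs-镜像与运行命令.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-02-nodejs-demo.yaml`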
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-02-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
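# 04-03-nodejs-环境变量与配置注入 (placeholder)
Corresponding document: [`docs/04-03-nodejs-环境变量与配置注入.md`](../../docs/04-03-nodejs-环境变量与配置注入.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-03-nodejs-demo.yaml`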
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-03-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
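# 04-04-nodejs-端口与Service (placeholder)
Corresponding document: [`docs/04-04-nodejs-端口与Service.md`](../../docs/04-04-nodejs-端口与Service.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-04-nodejs-demo.yaml`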
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-04-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
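# 04-05-nodejs-资源请求与限制 (placeholder)
Corresponding document: [`docs/04-05-nodejs-资源请求与限制.md`](../../docs/04-05-nodejs-资源请求与限制.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-05-nodejs-demo.yaml`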
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-05-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
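# 04-06-nodejs-探针与健康检查 (placeholder)
Corresponding document: [`docs/04-06-nodejs-探针与健康检查.md`](../../docs/04-06-nodejs-探针与健康检查.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-06-nodejs-demo.yaml`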
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-06-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
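# 04-07-nodejs-调度与亲和 (placeholder)
Corresponding document: [`docs/04-07-nodejs-调度与亲和.md`](../../docs/04-07-nodejs-调度与亲和.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-07-nodejs-demo.yaml`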
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-07-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
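# 04-08-nodejs-安全上下文 (placeholder)
Corresponding document: [`docs/04-08-nodejs-安全上下文.md`](../../docs/04-08-nodejs-安全上下文.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-08-nodejs-demo.yaml`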
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-08-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
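# 04-09-nodejs-存储与卷 (placeholder)
Corresponding document: [`docs/04-09-nodejs-存储与卷.md`](../../docs/04-09-nodejs-存储与卷.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-09-nodejs-demo.yaml`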
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-09-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
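# 04-10-nodejs-Ingress与Traefik (placeholder)
Corresponding document: [`docs/04-10-nodejs-Ingress与Traefik.md`](../../docs/04-10-nodejs-Ingress与Traefik.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-10-nodejs-demo.yaml`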
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-10-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
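# 04-11-nodejs-副本与滚动发布 (placeholder)
Corresponding document: [`docs/04-11-nodejs-副本与滚动发布.md`](../../docs/04-11-nodejs-副本与滚动发布.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-11-nodejs-demo.yaml`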
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-11-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
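# 04-12-nodejs-TLS与证书 (placeholder)
Corresponding document: [`docs/04-12-nodejs-TLS与证书.md`](../../docs/04-12-nodejs-TLS与证书.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-12-nodejs-demo.yaml`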
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-12-nodejs-demo.yaml
```

View File

@@ -0,0 +1,13 @@
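# 04-13-nodejs-HPA (placeholder)
Corresponding document: [`docs/04-13-nodejs-HPA.md`](../../docs/04-13-nodejs-HPA.md)
## Source-of-truth manifests (reusing the 04-01 cumulative directory)
- Source-of-truth directory: `ansible/files/04-01-nodejs-demo/`
- Corresponding cumulative manifest: `04-13-nodejs-demo.yaml`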
```bash
kubectl apply -f ansible/files/04-01-nodejs-demo/04-13-nodejs-demo.yaml
```
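The 04-01 through 04-13 entries in this commit all point into the same cumulative directory, so the manifest path can be derived from the doc_id alone. A minimal sketch, assuming that naming pattern holds for every stage; `nodejs_manifest_path` is a hypothetical helper for illustration, not a script in this repository:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): map a 04-XX doc_id to
# its cumulative manifest under ansible/files/04-01-nodejs-demo/.
nodejs_manifest_path() {
  local doc_id="$1"
  # Accept only 04-01 .. 04-13, the stages that reference a cumulative manifest.
  if [[ ! "$doc_id" =~ ^04-(0[1-9]|1[0-3])$ ]]; then
    echo "unsupported doc_id: $doc_id" >&2
    return 1
  fi
  printf 'ansible/files/04-01-nodejs-demo/%s-nodejs-demo.yaml\n' "$doc_id"
}

# Print the manifest path for the 04-07 stage.
nodejs_manifest_path 04-07
```

The printed path can then be passed to `kubectl apply -f`, in the same way as the per-stage commands in these READMEs.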

View File

@@ -0,0 +1,9 @@
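# 04-14-nodejs-GitOps与CI流水线 (placeholder)
Corresponding document: [`docs/04-14-nodejs-GitOps与CI流水线.md`](../../docs/04-14-nodejs-GitOps与CI流水线.md)
## Notes
- This is a process/methodology document and usually does not provide a fixed, reusable K8s manifest.
- If you need example manifests, pick the cumulative YAML for the corresponding stage from `ansible/files/04-01-nodejs-demo/`.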

View File

@@ -1,4 +1,28 @@
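# docs/05-01-k3s-部署homer首页面板.md: adjust host as needed
# docs/05-01-k3s-部署homer首页面板.md: adjust host and the config.yml inside the ConfigMap as needed
# Homer official image convention: custom configuration is mounted at /www/assets/config.yml inside the container (see the b4bz/homer documentation)
# If you do not want to use a ConfigMap: delete the ConfigMap at the top of this file and remove the env/volumes/volumeMounts sections from the Deployment
---
apiVersion: v1 # ConfigMap: holds the text of Homer's config.yml
kind: ConfigMap # Non-secret configuration; suitable for the navigation YAML
metadata: # Metadata
  name: homer-config # Name; must match the volume reference in the Deployment
  namespace: homer # Same namespace as the Deployment
data: # Key-value pairs: the key config.yml becomes the file name inside the container
  config.yml: | # Homer main configuration (edit navigation here only; no need for separate K8s YAML per link)
    ---
    title: "Lab Navigation" # Main page title
    subtitle: "Homer" # Subtitle
    theme: default # Theme: default / dark, etc. (see the official documentation)
    connectivityCheck: false # Whether to probe link reachability (can stay off in a lab environment)
    columns: 3 # Number of columns on desktop
    services: # Groups and bookmarks (maintained centrally here)
      - name: "Example group" # Group name
        icon: "fas fa-layer-group" # Font Awesome icon class name
        items: # Links in this group
          - name: "Homer project" # Card title
            url: "https://github.com/bastienwirtz/homer" # Target URL
            target: "_blank" # Open in a new tab
---
apiVersion: apps/v1 # API version used by the Deployment
kind: Deployment # Workload: Deployment (manages Pod replicas)
metadata: # Identifying information for this resource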
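@@ -14,11 +38,25 @@ spec: # Desired state of the Deployment
      labels: # Pod labels: used by selectors to match the Service/Deployment, etc.
        app: homer # Label app=homer on the Pod
    spec: # Pod spec
      volumes: # Pod-level volumes: mount the ConfigMap into the container
        - name: homer-config # Volume name, referenced by volumeMounts
          configMap: # Comes from the homer-config ConfigMap above
            name: homer-config # ConfigMap name
            items: # Mount only the required key; file name matches the key name
              - key: config.yml # Key in ConfigMap.data
                path: config.yml # File name created under the mount directory
      containers: # Container list (this example has a single container)
        - name: homer # Container name (appears in logs/debugging)
          image: b4bz/homer:latest # Homer image
          image: b4bz/homer:latest # Official Homer image (Docker Hub namespace b4bz)
          env: # Environment variables
            - name: INIT_ASSETS # Whether to copy default assets from the image at startup
              value: "0" # Set to 0 when config.yml comes from the ConfigMap, to avoid overwriting the custom configuration
          ports: # Container port declaration (used for probes, generated docs, etc.)
            - containerPort: 8080 # Port the container listens on (Homer defaults to 8080)
          volumeMounts: # Mount config.yml at the path Homer reads
            - name: homer-config # Matches volumes[].name
              mountPath: /www/assets/config.yml # Configuration file path in the official image
              subPath: config.yml # Single-file mount (does not shadow the whole /www/assets directory)
---
apiVersion: v1 # API version used by the Service
kind: Service # Network abstraction: exposes a set of Pods behind a stable access point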
@@ -51,3 +89,4 @@ spec: # Ingress rules
name: homer # Backend Service name
port: # Backend port configuration
number: 80 # Backend Service port

View File

@@ -40,4 +40,5 @@ spec: # Ingress rules
service: # Forward to the Service
name: onenav-external # Backend Service name
port: # Backend port
number: 80 # Service port
number: 80 # Port

View File

@@ -13,3 +13,4 @@ build_armv7: # Job name: armv7 build
tags: [armv7] # Matches only armv7 Runners
script: # Script to run
- echo "build for armv7" # Example output

View File

@@ -4,6 +4,7 @@
|------|------|
| `gitlab-ci-minimal.example.yml` | `docs/05-04-k3s-配置gitlab-cicd.md` |
| `gitlab-ci-multi-arch-deploy.example.yml` | `docs/05-04-k3s-配置gitlab-cicd.md` |
| `gitlab-ci-runner-tags.example.yml` | `docs/05-03-k3s-安装gitlab-含runner.md` |
| `../05-03-gitlab-runner/gitlab-ci-runner-tags.example.yml` | `docs/05-03-k3s-安装gitlab-含runner.md` |
Copy to `.gitlab-ci.yml` or reference via `include`; for variables and Runner settings, follow the documentation.

View File

@@ -18,3 +18,4 @@ deploy: # Job name: deploy
- kubectl --kubeconfig="$KUBECONFIG" apply -f manifests/ # Apply the manifests
only: # Trigger condition (legacy syntax)
- main # Triggered only on the main branch

View File

@@ -12,3 +12,4 @@ deploy_arm64: # Job name: arm64 deployment
script: # Script to run
- echo "$KUBE_CONFIG_CONTENT" > "$KUBECONFIG" # Write the kubeconfig
- kubectl --kubeconfig="$KUBECONFIG" apply -f manifests/arm64/ # Deploy the arm64 manifests

View File

@@ -0,0 +1,9 @@
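# 05-05-prometheus与grafana (placeholder)
Corresponding document: [`docs/05-05-prometheus与grafana.md`](../../docs/05-05-prometheus与grafana.md)
## Notes
- The monitoring stack is usually installed via a Helm chart (such as kube-prometheus-stack); the manifests change with the version.
- This directory exists only as a doc_id alignment placeholder; if the values/chart version are pinned later, manifests/values can be added here.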

View File

@@ -25,3 +25,4 @@ spec: # Desired state of the CronJob
persistentVolumeClaim: # Use a PVC as the storage source
claimName: openlist-backup-pvc # Name of the bound PVC (must exist)
restartPolicy: OnFailure # Pod restart policy: restart only on failure

View File

@@ -0,0 +1,37 @@
# docs/05-07-openclaw局域网联机.md: adjust NodePort/image as needed
apiVersion: apps/v1 # Deployment API version
kind: Deployment # Deployment controller
metadata: # Metadata
  name: openclaw-server # Deployment name
  namespace: default # Namespace
spec: # Desired state
  replicas: 1 # Replica count
  selector: # Selector
    matchLabels: # Labels to match
      app: openclaw-server # Label value
  template: # Pod template
    metadata: # Pod metadata
      labels: # Pod labels
        app: openclaw-server # Label value
    spec: # Pod spec
      containers: # Containers
        - name: openclaw-server # Container name
          image: ghcr.io/your/openclaw-server:latest # Image (adjust for your environment)
          ports: # Container ports
            - containerPort: 27015 # Example port (adjust to the application's actual port)
---
apiVersion: v1 # Service API version
kind: Service # Service
metadata: # Metadata
  name: openclaw-server # Service name
  namespace: default # Namespace
spec: # Spec
  type: NodePort # NodePort: exposed on the nodes
  selector: # Selects the backend Pods
    app: openclaw-server # Label selector
  ports: # Port list
    - name: game # Port name
      port: 27015 # Service port
      targetPort: 27015 # Pod port
      nodePort: 32715 # NodePort (adjust as needed; must be within the allowed range)

View File

@@ -0,0 +1,55 @@
# docs/05-09-openclaw-web-小游戏网页平台.md: adjust Ingress host/image as needed
apiVersion: apps/v1 # Deployment API version
kind: Deployment # Deployment
metadata: # Metadata
  name: openclaw-web # Name
  namespace: default # Namespace
spec: # Spec
  replicas: 1 # Replica count
  selector: # Selector
    matchLabels: # Labels to match
      app: openclaw-web # Label
  template: # Pod template
    metadata: # Pod metadata
      labels: # Pod labels
        app: openclaw-web # Label
    spec: # Pod spec
      containers: # Container list
        - name: openclaw-web # Container name
          image: ghcr.io/your/openclaw-web:latest # Image (adjust for your environment)
          ports: # Container ports
            - containerPort: 80 # Web port
---
apiVersion: v1 # Service API version
kind: Service # Service
metadata: # Metadata
  name: openclaw-web # Service name
  namespace: default # Namespace
spec: # Spec
  selector: # Selects the backend Pods
    app: openclaw-web # Label
  ports: # Ports
    - name: http # Name
      port: 80 # Service port
      targetPort: 80 # Pod port
---
apiVersion: networking.k8s.io/v1 # Ingress API version
kind: Ingress # Ingress
metadata: # Metadata
  name: openclaw-web # Name
  namespace: default # Namespace
  annotations: # Annotations
    traefik.ingress.kubernetes.io/router.entrypoints: web # Traefik entrypoint
spec: # Spec
  rules: # Rules
    - host: openclaw.example.com # Domain name (adjust for your environment)
      http: # HTTP
        paths: # Paths
          - path: / # Root path
            pathType: Prefix # Prefix match
            backend: # Backend
              service: # Service
                name: openclaw-web # Backend service
                port: # Port
                  number: 80 # Port number

View File

@@ -0,0 +1,9 @@
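# 06-01-k3s-networkpolicy-故障排查 (placeholder)
Corresponding document: [`docs/06-01-k3s-networkpolicy-故障排查.md`](../../docs/06-01-k3s-networkpolicy-故障排查.md)
## Notes
- This document is a troubleshooting handbook and command collection; it **does not provide a fixed deployable manifest**.
- This directory exists only as a doc_id alignment placeholder.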

View File

@@ -0,0 +1,9 @@
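# 06-02-运维小结 (placeholder)
Corresponding document: [`docs/06-02-运维小结.md`](../../docs/06-02-运维小结.md)
## Notes
- This document summarizes operations advice and inspection checkpoints; it usually does not map to a single deployable manifest.
- This directory exists only as a doc_id alignment placeholder.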

View File

@@ -0,0 +1,12 @@
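# 06-03-k3s-自动备份与恢复-openlist-webdav (alignment README)
Corresponding document: [`docs/06-03-k3s-自动备份与恢复-openlist-webdav.md`](../../docs/06-03-k3s-自动备份与恢复-openlist-webdav.md)
## Source-of-truth manifest directory
The deployable manifests for this document are currently consolidated in:
- `ansible/files/06-03-openlist-webdav/`
Note: that directory name does not mirror the docs file name; to satisfy the "doc_id directory alignment" convention, this directory serves only as a bridge and entry point.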

View File

@@ -25,3 +25,4 @@ spec: # CronJob spec
hostPath: # Use a host path
path: /data/app # Actual directory on the host (adjust for your environment)
restartPolicy: OnFailure # Restart on failure

View File

@@ -22,3 +22,4 @@ spec: # Job spec
hostPath: # Use a host directory as storage
path: /data/app # Real data directory on the node (adjust as needed)
restartPolicy: OnFailure # Restart on failure; finish once it succeeds

View File

@@ -1,83 +0,0 @@
# 03-03 Traefik Dashboard + ACME (single manifest, recommended)
# =============================================================================
# Contains: HelmChartConfig (local-path persistence for /data + ACME Cloudflare DNS-01 + Dashboard)
# + IngressRoute (/dashboard, /api)
# acme.json and the chart persistence both live under /data, so certificates survive Pod recreation; nodeSelector must pin a single node (RWO)
#
# Deploy: kubectl apply -f ansible/files/traefik-dashboard-acme/traefik-dashboard-acme.yaml
# Before use: replace <YOUR_REAL_EMAIL> and the nodeSelector hostname; the cloudflare-api-token Secret must already exist (see 03-02)
# Only one HelmChartConfig with metadata.name=traefik can exist in the whole cluster
#
# --- If you do not want the Dashboard ---
# Delete the whole IngressRoute section at the end; in valuesContent, remove ports (optional), --api.dashboard, and --api.insecure
#
# --- Temporarily disabling persistence (not recommended) ---
# Set persistence.enabled to false and delete the other fields under persistence (certificates may be lost with the Pod)
# =============================================================================
---
apiVersion: helm.cattle.io/v1 # API version of HelmChartConfig
kind: HelmChartConfig # HelmChartConfig: resource that injects values into K3s/Helm
metadata: # Resource identification
  name: traefik # Name of the target chart (must match the Traefik chart/convention)
  namespace: kube-system # Traefik usually runs in kube-system
spec: # Configuration to inject into the chart
  valuesContent: |- # YAML passed as a string into the Helm chart values (parsed by the chart)
    ports: # Expose entrypoints at the cluster entry
      web: # HTTP entrypoint
        expose: true # Allow exposing web
      websecure: # HTTPS entrypoint
        expose: true # Allow exposing websecure
    persistence: # Chart persistence settings: mount a PVC at /data
      enabled: true # Enable the persistent volume
      name: data # Volume name created/referenced by the chart (PVC, etc.)
      accessMode: ReadWriteOnce # RWO: mountable on only one node at a time
      size: 512Mi # Requested capacity (local-path creates the local volume accordingly)
      storageClass: local-path # Use the K3s local-path-provisioner
      path: /data # Mount directory inside the container (matches acme.storage)
    additionalArguments: # Extra CLI arguments passed to Traefik
      - "--api.dashboard=true" # Enable the dashboard
      - "--api.insecure=true" # k8s: makes the dashboard reachable at the entry point (mind the security implications)
      - "--log.level=INFO" # Log level
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.resolvers=1.1.1.1:53,1.0.0.1:53" # DNS resolver list (used for DNS-01)
      - "--certificatesresolvers.cloudflare.acme.email=<YOUR_REAL_EMAIL>" # ACME registration email
      - "--certificatesresolvers.cloudflare.acme.storage=/data/acme.json" # Certificate and account storage (/data inside the container)
      # - "--certificatesresolvers.cloudflare.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory" # For testing; remove before going live
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.provider=cloudflare" # DNS-01 provider: cloudflare
      - "--certificatesresolvers.cloudflare.acme.dnschallenge.propagation.delayBeforeChecks=600" # DNS-01 propagation wait, in seconds
      - "--ping=true" # Enable the ping health check
      - "--ping.entryPoint=websecure" # ping uses the websecure (HTTPS) entrypoint
      - "--entrypoints.web.proxyProtocol.trustedIPs=192.168.2.0/24" # Proxy subnet trusted by the web entrypoint
      - "--entrypoints.websecure.proxyProtocol.trustedIPs=192.168.2.0/24" # Proxy subnet trusted by the websecure entrypoint
    env: # Environment variable injection
      - name: CF_DNS_API_TOKEN # Environment variable name of the Cloudflare token used by Traefik
        valueFrom: # Taken from a Secret
          secretKeyRef: # Secret reference
            name: cloudflare-api-token # Secret name
            key: api-token # Key within the Secret
    nodeSelector: # Pin the Traefik Pod to a specific node (avoids data loss from local-path RWO migration)
      kubernetes.io/hostname: ylc61 # Target node hostname
---
# Explicit IngressRoute (same as 03-01, ensures /dashboard is reachable; the Helm ingressRoute.dashboard setting may not take effect in the K3s chart)
apiVersion: traefik.io/v1alpha1 # IngressRoute API version
kind: IngressRoute # Traefik routing CRD
metadata: # IngressRoute metadata
  name: traefik-dashboard # Route name
  namespace: kube-system # Namespace
spec: # IngressRoute rules
  entryPoints: # Entry point list
    - web # Use the web (HTTP) entry point
  routes: # Routing rule list
    - match: PathPrefix(`/dashboard`) || PathPrefix(`/api`) # Match the Dashboard/API path prefixes
      kind: Rule # Rule type
      services: # Backend services
        - name: api@internal # Traefik built-in API service
          kind: TraefikService # Service type

View File

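@@ -9,6 +9,20 @@ k3s_version: "" # Empty means use the get.k3s.io default (latest)
k3s_data_dir: "/storage"
k3s_server_ip: "192.168.2.61"
# Pre-install check for k3s: /storage must be a mount point on a different device from / (true is recommended for the 10G+32G lab; older environments with a plain-directory "fake /storage" can use false)
k3s_verify_storage_mount: true
# Optional: playbooks/k3s-prepare-storage.yml partitions, formats, and mounts the second whole disk at k3s_data_dir (this wipes the disk; see 01-06)
k3s_prepare_storage: false
# k3s_data_disk_device: "/dev/vdb"
# A whole NVMe disk is usually /dev/nvme0n1, with first partition /dev/nvme0n1p1; the playbook appends 1 or p1 automatically based on the device name
# Longhorn Helm (playbooks/longhorn-install.yml)
longhorn_chart_version: "1.7.2"
longhorn_install_node_packages: true
# Whether to apply this repository's local-path lab ConfigMap at the end of longhorn-install
longhorn_apply_local_path_lab: false
# Optional: whether to manage the /etc/hosts and firewalld baseline
k3s_manage_hosts: true
k3s_manage_firewalld: true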

View File

@@ -0,0 +1,37 @@
---
# Applies only this repository's local-path lab ConfigMap; does not install Longhorn. Runs on k3s_server.
# Matches "Method 1" in docs/03-05; source of truth: ansible/files/03-05-local-path-config/local-path-config-lab.json
- name: Apply local-path-config lab JSON
  hosts: k3s_server
  become: true
  run_once: true
  vars:
    k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
    local_path_json_src: "{{ playbook_dir }}/../files/03-05-local-path-config/local-path-config-lab.json"
    local_path_json_dest: /root/local-path-config-lab.json
  tasks:
    - name: Copy local-path lab json
      ansible.builtin.copy:
        src: "{{ local_path_json_src }}"
        dest: "{{ local_path_json_dest }}"
        mode: "0644"
    - name: Apply local-path-config ConfigMap
      ansible.builtin.shell: |
        set -e
        KUBECONFIG={{ k3s_kubeconfig }} kubectl -n kube-system create configmap local-path-config \
          --from-file=config.json={{ local_path_json_dest }} \
          --dry-run=client -o yaml | KUBECONFIG={{ k3s_kubeconfig }} kubectl apply -f -
      args:
        executable: /bin/bash
      changed_when: true
    - name: Restart local-path-provisioner if present
      ansible.builtin.shell: |
        KUBECONFIG={{ k3s_kubeconfig }} kubectl -n kube-system rollout restart deploy/local-path-provisioner
      args:
        executable: /bin/bash
      register: lp_restart
      failed_when: false
      changed_when: lp_restart.rc == 0

View File

@@ -1,4 +1,33 @@
---
- name: Verify /storage is a separate mount (optional)
  hosts: k3s_nodes
  become: true
  tasks:
    - name: Check / and /storage mount sources
      when: k3s_verify_storage_mount | default(false) | bool
      block:
        - name: Get mount source for /
          ansible.builtin.command: findmnt -n -o SOURCE /
          register: mnt_root
          changed_when: false
        - name: Get mount source for /storage
          ansible.builtin.command: findmnt -n -o SOURCE /storage
          register: mnt_storage
          changed_when: false
          failed_when: false
        - name: Assert /storage is mounted on a different device than /
          ansible.builtin.assert:
            that:
              - mnt_storage.rc == 0
              - (mnt_root.stdout | trim | length) > 0
              - (mnt_storage.stdout | trim | length) > 0
              - (mnt_root.stdout | trim) != (mnt_storage.stdout | trim)
            fail_msg: >-
              /storage must be a mount point on a block device different from /.
              See docs/00-04-部署环境说明.md and docs/01-06-节点初始化-ansible-实践.md
- name: Init base system
  hosts: k3s_nodes
  become: true
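
The assertion in the play above reduces to three conditions on two mount-source strings: both are non-empty, and they differ. A minimal sketch of that comparison outside Ansible, assuming the strings come from `findmnt -n -o SOURCE`; `storage_is_separate` is a hypothetical helper and the device names are illustrative only:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): the same conditions the
# playbook asserts, applied to two mount-source strings.
storage_is_separate() {
  # Strip all whitespace, roughly mirroring the playbook's `trim` filter.
  local root_src="${1//[[:space:]]/}"
  local storage_src="${2//[[:space:]]/}"
  [[ -n "$root_src" && -n "$storage_src" && "$root_src" != "$storage_src" ]]
}

# Example device names are assumptions for illustration only.
if storage_is_separate "/dev/vda2" "/dev/vdb1"; then
  echo "ok: /storage is on its own device"
fi
```

On a real node the two arguments would be `"$(findmnt -n -o SOURCE /)"` and `"$(findmnt -n -o SOURCE /storage)"`; an unmounted /storage yields an empty second argument, so the check fails just as the playbook's assert does.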

View File

@@ -0,0 +1,106 @@
---
# 可选在空白数据盘上创建单分区、ext4、fstab 并挂载到 k3s_data_dir默认 /storage
# 启用前在 group_vars/all.yml 设置 k3s_prepare_storage: true 与 k3s_data_disk_device如 /dev/vdb
# 会清空该磁盘上的数据。若 /storage 已是挂载点则跳过。
- name: Prepare data disk and mount to k3s_data_dir
hosts: k3s_nodes
become: true
tasks:
- name: Skip notice when storage prep disabled
ansible.builtin.debug:
msg: "k3s_prepare_storage is false — skipping (see group_vars/all.yml)"
when: not (k3s_prepare_storage | default(false) | bool)
- name: Prepare block storage for k3s_data_dir
when: k3s_prepare_storage | default(false) | bool
block:
- name: Require k3s_data_disk_device when k3s_prepare_storage is true
ansible.builtin.assert:
that:
- k3s_data_disk_device is defined
- (k3s_data_disk_device | string | length) > 0
fail_msg: "Set k3s_data_disk_device (e.g. /dev/vdb) in group_vars or host_vars"
- name: Verify k3s_data_disk_device is a block device
ansible.builtin.command: test -b {{ k3s_data_disk_device }}
changed_when: false
- name: Check whether k3s_data_dir is already a mountpoint
ansible.builtin.command: mountpoint -q {{ k3s_data_dir }}
register: mp_k3s
changed_when: false
failed_when: false
- name: Skip when k3s_data_dir already mounted
ansible.builtin.debug:
msg: "{{ k3s_data_dir }} already mounted — skipping partitioning on {{ inventory_hostname }}"
when: mp_k3s.rc == 0
- name: Install partitioning and filesystem tools
ansible.builtin.package:
name:
- parted
- e2fsprogs
state: present
when: mp_k3s.rc != 0
- name: Compute first partition path (nvme*n* -> p1, else 1)
ansible.builtin.set_fact:
k3s_data_partition: >-
{{ k3s_data_disk_device }}{{ 'p1' if (k3s_data_disk_device | regex_search('nvme[0-9]+n[0-9]+$')) else '1' }}
when: mp_k3s.rc != 0
- name: Create GPT and single ext4 partition
ansible.builtin.command: >-
parted -s {{ k3s_data_disk_device }} mklabel gpt mkpart primary ext4 0% 100%
args:
creates: "{{ k3s_data_partition }}"
when: mp_k3s.rc != 0
- name: Wait for partition node in /dev
ansible.builtin.wait_for:
path: "{{ k3s_data_partition }}"
state: present
timeout: 60
when: mp_k3s.rc != 0
- name: Detect existing filesystem on partition
ansible.builtin.command: blkid -s TYPE -o value {{ k3s_data_partition }}
register: fs_type
changed_when: false
failed_when: false
when: mp_k3s.rc != 0
- name: Create ext4 on partition
ansible.builtin.command: mkfs.ext4 -F {{ k3s_data_partition }}
when:
- mp_k3s.rc != 0
- (fs_type.stdout | default('') | trim | length) == 0
- name: Read UUID of partition
ansible.builtin.command: blkid -s UUID -o value {{ k3s_data_partition }}
register: blk_uuid
changed_when: false
when: mp_k3s.rc != 0
- name: Ensure mount directory exists
ansible.builtin.file:
path: "{{ k3s_data_dir }}"
state: directory
mode: "0755"
when: mp_k3s.rc != 0
- name: Add fstab entry for k3s_data_dir
ansible.builtin.lineinfile:
path: /etc/fstab
regexp: "^UUID={{ blk_uuid.stdout | trim }}\\s"
line: "UUID={{ blk_uuid.stdout | trim }} {{ k3s_data_dir }} ext4 defaults,nofail 0 2"
create: true
mode: "0644"
when: mp_k3s.rc != 0
- name: Mount all from fstab
ansible.builtin.command: mount -a
changed_when: true
when: mp_k3s.rc != 0
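
The partition-naming rule in the "Compute first partition path" task can be tried outside Ansible. The following is a minimal shell sketch under the same assumption as the playbook: only device names ending in `nvme<N>n<M>` take the `p1` suffix, and everything else takes `1`. The function name `first_partition` is hypothetical and is not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper: mirrors the playbook's Jinja2 expression in plain shell.
# Prints the path of the first partition for a whole-disk device path.
first_partition() {
  fp_dev="$1"
  # Same pattern as the playbook: names ending in nvme<N>n<M> use a "p" separator.
  if printf '%s\n' "$fp_dev" | grep -Eq 'nvme[0-9]+n[0-9]+$'; then
    printf '%sp1\n' "$fp_dev"
  else
    printf '%s1\n' "$fp_dev"
  fi
}

first_partition /dev/vdb
first_partition /dev/nvme0n1
```

On Linux, other devices whose names end in a digit (for example `mmcblk0`) also use a `p` separator. Neither the playbook nor this sketch handles them, so such hosts would need the rule extended.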


@@ -0,0 +1,251 @@
---
# Install Longhorn with Helm (matches docs/03-07). Runs on the control node; requires KUBECONFIG=/etc/rancher/k3s/k3s.yaml
# Variables: longhorn_chart_version, longhorn_install_node_packages, and longhorn_apply_local_path_lab in group_vars/all.yml
- name: Longhorn node packages (iSCSI, NFS client)
hosts: k3s_nodes
become: true
tasks:
- name: Install Longhorn OS dependencies
when: longhorn_install_node_packages | default(true) | bool
block:
- name: Install iscsi + nfs (dnf/yum)
ansible.builtin.package:
name:
- iscsi-initiator-utils
- nfs-utils
state: present
- name: Enable iscsid
ansible.builtin.systemd:
name: iscsid
enabled: true
state: started
- name: Ensure Longhorn data subdirectory exists on all nodes
ansible.builtin.file:
path: "{{ k3s_data_dir }}/longhorn"
state: directory
mode: "0700"
- name: Pre-pull Longhorn images on all nodes (optional, avoid DockerHub EOF/ImagePullBackOff)
when: longhorn_prepull_images | default(true) | bool
ansible.builtin.shell: |
set -e
CTR="ctr --address /run/k3s/containerd/containerd.sock -n k8s.io"
imgs=(
"docker.io/longhornio/longhorn-manager:v{{ longhorn_chart_version }}"
"docker.io/longhornio/longhorn-ui:v{{ longhorn_chart_version }}"
"docker.io/longhornio/longhorn-share-manager:v{{ longhorn_chart_version }}"
"docker.io/longhornio/longhorn-engine:v{{ longhorn_chart_version }}"
"docker.io/longhornio/longhorn-instance-manager:v{{ longhorn_chart_version }}"
"docker.io/longhornio/backing-image-manager:v{{ longhorn_chart_version }}"
"docker.io/longhornio/support-bundle-kit:v0.0.45"
)
for img in "${imgs[@]}"; do
ok=0
for i in 1 2 3 4 5; do
echo "[pull] $img (try $i/5)"
if $CTR images pull "$img"; then
ok=1
break
fi
sleep $((i * 3))
done
if [ "$ok" -ne 1 ]; then
echo "[ERR] failed pulling $img after retries"
exit 1
fi
done
args:
executable: /bin/bash
changed_when: true
- name: Install Longhorn with Helm on first server
hosts: k3s_server
become: true
run_once: true
vars:
longhorn_values_src: "{{ playbook_dir }}/../files/03-07-longhorn/values-lab.yaml"
longhorn_values_dest: /root/longhorn-values-lab.yaml
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
tasks:
- name: Install helm package (Fedora/RHEL family)
ansible.builtin.package:
name: helm
state: present
ignore_errors: true
register: helm_pkg
- name: Hint if helm package install failed (install Helm 3 manually if needed)
ansible.builtin.debug:
msg: "If dnf/yum could not install helm, see https://helm.sh/docs/intro/install/"
when: helm_pkg.failed | default(false)
- name: Fail if helm binary still unavailable
ansible.builtin.command: which helm
register: helm_which
changed_when: false
failed_when: helm_which.rc != 0
- name: Copy lab values to server
ansible.builtin.copy:
src: "{{ longhorn_values_src }}"
dest: "{{ longhorn_values_dest }}"
mode: "0600"
- name: Ensure longhorn-system namespace is not stuck Terminating (force finalize if needed)
ansible.builtin.shell: |
set -e
export KUBECONFIG={{ k3s_kubeconfig }}
ns="longhorn-system"
phase="$(kubectl get ns "$ns" -o jsonpath='{.status.phase}' 2>/dev/null || true)"
if [ "$phase" = "Terminating" ]; then
echo "[WARN] namespace $ns is Terminating; force finalize to unblock install"
kubectl get ns "$ns" -o json > /tmp/ns.json
python3 -c "import json; obj=json.load(open('/tmp/ns.json')); obj.setdefault('spec',{}); obj['spec']['finalizers']=[]; json.dump(obj, open('/tmp/ns-finalize.json','w'))"
kubectl replace --raw "/api/v1/namespaces/$ns/finalize" -f /tmp/ns-finalize.json >/dev/null
fi
args:
executable: /bin/bash
changed_when: true
failed_when: false
- name: Ensure longhorn Helm repo
ansible.builtin.shell: |
set -e
if ! helm repo list 2>/dev/null | grep -q '^longhorn'; then
helm repo add longhorn https://charts.longhorn.io
fi
helm repo update
environment:
KUBECONFIG: "{{ k3s_kubeconfig }}"
args:
executable: /bin/bash
changed_when: true
- name: Delete leftover longhorn PriorityClass (cluster-scoped) to avoid Helm ownership conflicts
ansible.builtin.shell: |
set -e
KUBECONFIG={{ k3s_kubeconfig }} kubectl delete priorityclass longhorn-critical --ignore-not-found=true
args:
executable: /bin/bash
changed_when: true
failed_when: false
- name: Delete leftover Longhorn CRDs (cluster-scoped) to avoid Helm ownership conflicts
ansible.builtin.shell: |
set -e
export KUBECONFIG={{ k3s_kubeconfig }}
crd_list="$(kubectl get crd -o name 2>/dev/null | grep 'longhorn.io' || true)"
if [ -n "$crd_list" ]; then
echo "$crd_list" | while read -r crd; do
[ -z "$crd" ] && continue
timeout 20s kubectl delete "$crd" --ignore-not-found=true || true
done
fi
args:
executable: /bin/bash
changed_when: true
failed_when: false
- name: Delete leftover Longhorn ClusterRole/ClusterRoleBinding (cluster-scoped)
ansible.builtin.shell: |
set -e
export KUBECONFIG={{ k3s_kubeconfig }}
role_list="$(kubectl get clusterrole -o name 2>/dev/null | grep 'longhorn' || true)"
if [ -n "$role_list" ]; then
echo "$role_list" | while read -r role; do
[ -z "$role" ] && continue
timeout 20s kubectl delete "$role" --ignore-not-found=true || true
done
fi
binding_list="$(kubectl get clusterrolebinding -o name 2>/dev/null | grep 'longhorn' || true)"
if [ -n "$binding_list" ]; then
echo "$binding_list" | while read -r binding; do
[ -z "$binding" ] && continue
timeout 20s kubectl delete "$binding" --ignore-not-found=true || true
done
fi
args:
executable: /bin/bash
changed_when: true
failed_when: false
- name: Cleanup leftover Helm release records for Longhorn (default + longhorn-system)
ansible.builtin.shell: |
set -e
export KUBECONFIG={{ k3s_kubeconfig }}
# A failed or interrupted install can leave release secrets in default or longhorn-system, which later causes:
# - "cannot re-use a name that is still in use"
# - conflicting meta.helm.sh/release-namespace annotations on cluster-scoped resources
for ns in longhorn-system default; do
if helm -n "$ns" list --all 2>/dev/null | grep -q '^longhorn'; then
# uninstall can hang (for example on the uninstall job or a hook); bound it with a timeout so it does not block the whole automation run
timeout 120s helm -n "$ns" uninstall longhorn --no-hooks || true
fi
sec_list="$(kubectl -n "$ns" get secret -o name 2>/dev/null | grep '^secret/sh\.helm\.release\.v1\.longhorn\.' || true)"
if [ -n "$sec_list" ]; then
echo "$sec_list" | xargs -n1 kubectl -n "$ns" delete --ignore-not-found=true
fi
done
environment:
KUBECONFIG: "{{ k3s_kubeconfig }}"
args:
executable: /bin/bash
changed_when: true
failed_when: false
- name: Helm upgrade/install Longhorn (fall back to install --replace on failure)
ansible.builtin.shell: |
set -e
helm upgrade --install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace -f {{ longhorn_values_dest }} --version {{ longhorn_chart_version }} --wait --timeout 15m || helm install --replace longhorn longhorn/longhorn --namespace longhorn-system --create-namespace -f {{ longhorn_values_dest }} --version {{ longhorn_chart_version }} --wait --timeout 15m
environment:
KUBECONFIG: "{{ k3s_kubeconfig }}"
args:
executable: /bin/bash
register: helm_longhorn
changed_when: true
- name: Apply local-path-config lab defaults (optional)
hosts: k3s_server
become: true
run_once: true
vars:
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
local_path_json_src: "{{ playbook_dir }}/../files/03-05-local-path-config/local-path-config-lab.json"
local_path_json_dest: /root/local-path-config-lab.json
tasks:
- name: Apply local-path-config lab defaults (optional)
when: longhorn_apply_local_path_lab | default(false) | bool
block:
- name: Copy local-path lab json
ansible.builtin.copy:
src: "{{ local_path_json_src }}"
dest: "{{ local_path_json_dest }}"
mode: "0644"
- name: Apply local-path-config ConfigMap
ansible.builtin.shell: |
set -e
KUBECONFIG={{ k3s_kubeconfig }} kubectl -n kube-system create configmap local-path-config \
--from-file=config.json={{ local_path_json_dest }} \
--dry-run=client -o yaml | KUBECONFIG={{ k3s_kubeconfig }} kubectl apply -f -
args:
executable: /bin/bash
changed_when: true
- name: Restart local-path-provisioner if present
ansible.builtin.shell: |
KUBECONFIG={{ k3s_kubeconfig }} kubectl -n kube-system rollout restart deploy/local-path-provisioner
args:
executable: /bin/bash
register: lp_restart
failed_when: false
changed_when: lp_restart.rc == 0
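
The image pre-pull task in the first play retries each pull up to five times and sleeps longer after each failure. As a hedged sketch, the same retry-with-linear-backoff pattern can be factored into a small shell helper. The `retry` function, its argument order, and the delay parameter are assumptions made for illustration; they do not exist in the repository.

```shell
#!/bin/sh
# Hypothetical helper: retry MAX_TRIES BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_TRIES attempts have failed.
# The sleep after attempt N is N * BASE_DELAY_SECONDS (linear backoff).
retry() {
  retry_max="$1"
  retry_delay="$2"
  shift 2
  retry_attempt=1
  while [ "$retry_attempt" -le "$retry_max" ]; do
    if "$@"; then
      return 0
    fi
    echo "[retry] attempt $retry_attempt/$retry_max failed: $*" >&2
    # Skip the sleep after the final attempt; there is nothing left to wait for.
    if [ "$retry_attempt" -lt "$retry_max" ]; then
      sleep $((retry_attempt * retry_delay))
    fi
    retry_attempt=$((retry_attempt + 1))
  done
  return 1
}

# Demo with a command that always succeeds and a zero delay.
retry 3 0 true && echo "ok"

# Example matching the playbook's pull loop (not run here):
#   retry 5 3 ctr --address /run/k3s/containerd/containerd.sock -n k8s.io images pull "$img"
```

Unlike the inline loop, this sketch does not sleep after the last failed attempt, which shortens the failure path slightly.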


@@ -3,7 +3,7 @@
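# Corresponding doc: docs/02-05-nginx-验证矩阵-一键部署.md (the separate 02-01 to 02-04 docs have been merged into it)
#
# Flow: copy manifests → kubectl apply → wait for Pods to become ready → verify Pod node placement → curl 16 targets
-# manifests: ansible/files/nginx-matrix/ (M1 control-plane / M2 ylc61 / M3 worker / M4 ylc64; adjust the 02/04 hostname to match your environment)
+# manifests: ansible/files/02-05-nginx-matrix/ (M1 control-plane / M2 ylc61 / M3 worker / M4 ylc64; adjust the 02/04 hostname to match your environment)
#
# Run (from the ansible/ directory):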
# ansible-playbook -i inventory.ini playbooks/nginx-matrix-deploy.yml
@@ -15,8 +15,8 @@
run_once: true
vars:
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
-# manifests live in ansible/files/nginx-matrix/, in the same project as the playbook
-manifests_path: "{{ playbook_dir }}/../files/nginx-matrix"
+# manifests live in ansible/files/02-05-nginx-matrix/, in the same project as the playbook
+manifests_path: "{{ playbook_dir }}/../files/02-05-nginx-matrix"
tasks:
- name: Ensure manifests path exists
ansible.builtin.stat:


@@ -3,7 +3,7 @@
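# Corresponding doc: docs/03-02-k3s-traefik-acme.md
#
# Flow: copy the TLS and HTTP-only manifests → automatically delete any existing non-TLS nginx matrix (02-05) → kubectl apply (8 routes in total across TLS and HTTP-only) → wait for Pods to become ready → run the HTTP-only and HTTPS curl matrix checks (test01.jackadam.top to test04.jackadam.top)
-# manifests: ansible/files/nginx-matrix-tls/; the domains are test01.jackadam.top to test04.jackadam.top (adjust the M2/M4 hostname to match your environment); in the Ingress/IngressRoute, TLS routes bind only to websecure and HTTP-only routes bind only to web
+# manifests: ansible/files/03-02-nginx-matrix-tls/; the domains are test01.jackadam.top to test04.jackadam.top (adjust the M2/M4 hostname to match your environment); in the Ingress/IngressRoute, TLS routes bind only to websecure and HTTP-only routes bind only to web
# Prerequisites: ACME is configured per 03-02 (Secret + traefik-acme.yaml), and test01.jackadam.top to test04.jackadam.top resolve to the ingress IP
#
# Run (from the ansible/ directory):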
@@ -18,7 +18,7 @@
vars:
# mode is passed in with -e mode=cleanup and defaults to deploy when omitted (do not write mode: "{{ mode | default('deploy') }}" in vars; it recurses)
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
-manifests_path: "{{ playbook_dir }}/../files/nginx-matrix-tls"
+manifests_path: "{{ playbook_dir }}/../files/03-02-nginx-matrix-tls"
tls_domains:
- test01.jackadam.top
- test02.jackadam.top


@@ -1,5 +1,5 @@
---
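-# Apply the Node.js demo manifests in one step (aligned with docs/04-01 to 04-13 and ansible/files/nodejs-demo)
+# Apply the Node.js demo manifests in one step (aligned with docs/04-01 to 04-13 and ansible/files/04-01-nodejs-demo)
#
# Run (from the repository root):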
# ansible-playbook -i ansible/inventory.ini ansible/playbooks/nodejs-demo-apply.yml \
@@ -13,7 +13,7 @@
vars:
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
nodejs_demo_manifest: "04-01-nodejs-demo.yaml"
-manifests_dir: "{{ playbook_dir }}/../files/nodejs-demo"
+manifests_dir: "{{ playbook_dir }}/../files/04-01-nodejs-demo"
tasks:
- name: Ensure manifest file exists
ansible.builtin.stat:


@@ -0,0 +1,10 @@
- name: "00-01 noop verify"
hosts: localhost
gather_facts: false
vars:
repo_root: "{{ playbook_dir }}/../../.."
doc_id: "00-01"
doc_filename: "00-01-k3s-基础概念.md"
tasks:
- ansible.builtin.import_tasks: "{{ playbook_dir }}/_noop-tasks.yml"


@@ -0,0 +1,10 @@
- name: "00-04 noop verify"
hosts: localhost
gather_facts: false
vars:
repo_root: "{{ playbook_dir }}/../../.."
doc_id: "00-04"
doc_filename: "00-04-部署环境说明.md"
tasks:
- ansible.builtin.import_tasks: "{{ playbook_dir }}/_noop-tasks.yml"


@@ -0,0 +1,24 @@
- name: "01-01 k3s baseline verify (nodes + core deploys)"
hosts: k3s_server
become: true
run_once: true
vars:
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
tasks:
- name: kubectl get nodes
ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get nodes -o wide
changed_when: false
- name: kube-system pods summary
ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get pods -n kube-system -o wide
changed_when: false
- name: Assert core components exist (coredns, traefik)
ansible.builtin.shell: |
set -e
KUBECONFIG={{ k3s_kubeconfig }} kubectl -n kube-system get deploy coredns
KUBECONFIG={{ k3s_kubeconfig }} kubectl -n kube-system get deploy traefik
args:
executable: /bin/bash
changed_when: false
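
The "Assert core components exist" task stops at the first missing deployment because of `set -e`. As an alternative sketch, the check can report every missing name in one pass by filtering the output of `kubectl get deploy -o name`, which prints one `deployment.apps/<name>` entry per line. The `missing_deployments` function is hypothetical and not part of the repository; the sample input stands in for a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: reads `kubectl get deploy -o name` output on stdin and
# prints each required deployment name (given as arguments) that is absent.
missing_deployments() {
  md_present="$(cat)"
  for md_name in "$@"; do
    # -F: fixed string, -x: whole-line match, so "traefik" does not match "traefik-foo".
    if ! printf '%s\n' "$md_present" | grep -qxF "deployment.apps/$md_name"; then
      printf '%s\n' "$md_name"
    fi
  done
}

# Demo with canned input instead of a live cluster.
missing_deployments coredns traefik <<'EOF'
deployment.apps/coredns
deployment.apps/metrics-server
EOF

# Against a real K3s server (not run here):
#   KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n kube-system get deploy -o name \
#     | missing_deployments coredns traefik
```

An empty result means all required deployments exist; a wrapper task could fail when the output is non-empty.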


@@ -0,0 +1,11 @@
- name: "01-02 k3s baseline verify (nodes)"
hosts: k3s_server
become: true
run_once: true
vars:
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
tasks:
- name: kubectl get nodes
ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get nodes -o wide
changed_when: false


@@ -0,0 +1,10 @@
- name: "01-03 noop verify"
hosts: localhost
gather_facts: false
vars:
repo_root: "{{ playbook_dir }}/../../.."
doc_id: "01-03"
doc_filename: "01-03-armv7-standalone-docker.md"
tasks:
- ansible.builtin.import_tasks: "{{ playbook_dir }}/_noop-tasks.yml"


@@ -0,0 +1,10 @@
- name: "01-04 noop verify"
hosts: localhost
gather_facts: false
vars:
repo_root: "{{ playbook_dir }}/../../.."
doc_id: "01-04"
doc_filename: "01-04-双控制节点ha.md"
tasks:
- ansible.builtin.import_tasks: "{{ playbook_dir }}/_noop-tasks.yml"


@@ -0,0 +1,10 @@
- name: "01-05 noop verify"
hosts: localhost
gather_facts: false
vars:
repo_root: "{{ playbook_dir }}/../../.."
doc_id: "01-05"
doc_filename: "01-05-armv7-nfs服务安装.md"
tasks:
- ansible.builtin.import_tasks: "{{ playbook_dir }}/_noop-tasks.yml"


@@ -0,0 +1,11 @@
- name: "01-06 k3s baseline verify (kube-system pods)"
hosts: k3s_server
become: true
run_once: true
vars:
k3s_kubeconfig: /etc/rancher/k3s/k3s.yaml
tasks:
- name: kube-system pods summary
ansible.builtin.shell: KUBECONFIG={{ k3s_kubeconfig }} kubectl get pods -n kube-system -o wide
changed_when: false


@@ -0,0 +1,10 @@
- name: "01-07 noop verify"
hosts: localhost
gather_facts: false
vars:
repo_root: "{{ playbook_dir }}/../../.."
doc_id: "01-07"
doc_filename: "01-07-openwrt-haproxy.md"
tasks:
- ansible.builtin.import_tasks: "{{ playbook_dir }}/_noop-tasks.yml"


@@ -0,0 +1,10 @@
- name: "02-00 noop verify"
hosts: localhost
gather_facts: false
vars:
repo_root: "{{ playbook_dir }}/../../.."
doc_id: "02-00"
doc_filename: "02-00-nginx-系列说明.md"
tasks:
- ansible.builtin.import_tasks: "{{ playbook_dir }}/_noop-tasks.yml"

Some files were not shown because too many files have changed in this diff.