DevOps Archives - 论野生技术&二次元

Fixing the "agent refused operation" error when logging in with gpg-agent on Ubuntu 22.04 / OpenSSH 8.9

on May 3, 2022

DevOps

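Ubuntu 22.04 upgrades OpenSSH to 8.9, which enables sntrup761x25519-sha512@openssh.com by default as a key exchange (KEX) method. This algorithm uses a 512-bit hash.

If both the client and the server have been upgraded to 8.9 or later, they negotiate this KEX algorithm. Signing through gpg-agent's SSH support then fails with agent refused operation.

The gpg-agent log shows the error Provided object is too large.

The fix is to disable this algorithm on the client (~/.ssh/config) or on the server (/etc/ssh/sshd_config):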

KexAlgorithms -sntrup761x25519-sha512@openssh.com

By the same logic, diffie-hellman-group16-sha512 and diffie-hellman-group18-sha512 should also be disabled, but their priority is already very low.
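To make the combined setting concrete, here is a minimal sketch that writes a client-side stanza disabling all three algorithms in one line. The output file name and the `Host *` scope are assumptions for illustration; review the result before merging it into ~/.ssh/config.

```shell
#!/bin/sh
# Hypothetical output path; nothing is written to ~/.ssh/config directly.
SNIPPET="${SNIPPET:-./ssh_config.gpg-agent-workaround}"

# A leading "-" removes the listed algorithms from OpenSSH's default set.
cat > "$SNIPPET" <<'EOF'
Host *
    KexAlgorithms -sntrup761x25519-sha512@openssh.com,diffie-hellman-group16-sha512,diffie-hellman-group18-sha512
EOF

cat "$SNIPPET"
```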

If the problem persists, also change the host key algorithm, for example to:

HostKeyAlgorithms ssh-ed25519

HashiCorp Nomad pitfalls

on June 23, 2020

DevOps

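Our team did not have enough k8s expertise, so we ended up using Nomad for container orchestration. This post records the pitfalls we ran into:

No clear indication of the network allocation quota

Neither the Nomad documentation nor the various Grafana dashboards mention that network allocation on a node has an upper limit, even though the metrics do expose it (nomad_client_allocated_network/nomad_client_unallocated_network). How the limit is calculated is still unclear and probably requires reading the code. On our EC2 instances we have seen limits of 500Mb and 1000Mb.

If unspecified, each task reserves 100Mb of bandwidth by default (see the documentation). This is a hard cap: once the node is fully allocated, containers that exceed it are throttled. Personally I think this design is terrible. Network bandwidth is far burstier than CPU or memory, so if only a hard cap can be set, utilization will obviously be very low. This is QoS from the last century.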
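As a rough illustration of how quickly the cap is reached, the arithmetic below divides a node limit by the default per-task reservation. The 1000Mb and 100Mb figures are the ones mentioned above; treating the limit as a plain sum of reservations is an assumption, since the exact calculation is unclear.

```shell
#!/bin/sh
node_limit_mbit=1000    # limit observed on one of the EC2 nodes
task_default_mbit=100   # default reservation when a task does not specify one

# Assumption: the node counts as full once reservations add up to the limit.
max_tasks=$((node_limit_mbit / task_default_mbit))
echo "tasks that fit before network is fully allocated: ${max_tasks}"
```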

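Template re-render during allocation startup

This is a bug: https://github.com/hashicorp/nomad/issues/5459. If you use the integrated consul-template for service discovery, a re-render can in some cases be triggered while the allocation is still starting, which makes the nomad client send a signal to the container. If the container is not up yet, the nomad client refuses to send the signal, kills the container, and does not try to restart it.

I have no idea who came up with such a bizarre design.

system-type tasks

If a task is of type system, it runs on every node that satisfies its constraints. However, its default restart parameters make it easy for transient errors to bring the whole task down, so we set the restart parameters ourselves: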

restart {
    attempts = 60       # attempts no more than interval / delay
    mode     = "delay"
    delay    = "5s"
    interval = "5m"
}
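The comment in the block above says attempts should not exceed interval / delay. A small sketch of that arithmetic, with the two durations converted to seconds by hand:

```shell
#!/bin/sh
# Values mirror the restart block above, expressed in seconds.
interval_s=300   # interval = "5m"
delay_s=5        # delay    = "5s"

# Upper bound on the number of restarts that fit inside one interval.
max_attempts=$((interval_s / delay_s))
echo "attempts should be at most ${max_attempts}"
```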

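A single container port cannot be mapped to multiple host ports

With docker, a single container port can be mapped to any number of host ports. Nomad cannot do this, which looks like a bug in how the job definition is processed (issue link).

With the configuration below, only port 8001 is mapped; in port_map, http1 is overwritten by http2.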

job "test" {
  group "test" {
    task "test" {
      driver = "docker"
      config {
        image = "nginx:latest"
        port_map = {
          http1    = 80
          http2    = 80
        }
      }
      resources {
        network {
          port "http1" {
            static = 8000
          }
          port "http2" {
            static = 8001
          }
        }
      }
    }
  }
}

The configuration below does not raise an error, but still only 8001 is mapped.

job "test" {
  group "test" {
    task "test" {
      driver = "docker"
      config {
        image = "nginx:latest"
        port_map = {
          http1    = 80
        }
      }
      resources {
        network {
          port "http1" {
            static = 8000
            static = 8001
          }
        }
      }
    }
  }
}

The workaround is to listen on multiple ports inside the container and map each one to a different external port.
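A minimal sketch of that workaround, written as a script that generates the job file. It assumes the image has been configured to listen on both 80 and 81; the stock nginx:latest image only listens on 80, so port 81 and the file name test.nomad are hypothetical. Like the examples above, the job is a fragment rather than a complete definition.

```shell
#!/bin/sh
# Hypothetical file name; the job mirrors the examples above.
JOB_FILE="${JOB_FILE:-./test.nomad}"

cat > "$JOB_FILE" <<'EOF'
job "test" {
  group "test" {
    task "test" {
      driver = "docker"
      config {
        # Assumption: this image listens on both 80 and 81.
        image = "nginx:latest"
        port_map = {
          http1 = 80
          http2 = 81
        }
      }
      resources {
        network {
          port "http1" {
            static = 8000
          }
          port "http2" {
            static = 8001
          }
        }
      }
    }
  }
}
EOF

echo "wrote ${JOB_FILE}"
```

Because each label now points at a distinct container port, neither entry in port_map overwrites the other.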

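The terraform provider cannot detect changes to Nomad jobs

An ancient bug: https://github.com/hashicorp/terraform-provider-nomad/issues/1

If the Nomad job definition is modified outside terraform, the terraform provider cannot detect it.

So what is the point of having it, then?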

Converting MaxMind's new GeoIP databases to the legacy format

on November 28, 2019

DevOps

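Because MaxMind no longer updates the v1 GeoIP databases, I convert the format myself from the v2 CSV files.

The tool used is https://github.com/fffonion/geolite2legacy.

The city and ASN databases can be downloaded here and are updated daily. You can also query cidr.me directly; see this post for usage.

The ASN data comes from the HE BGP toolkit, which also allows looking up the upstream ASN. The tool used is https://github.com/fffonion/GeoIPASNum-Generator.

This person also publishes databases that are updated (monthly?), but the combined IPv6+IPv4 database has problems. It was probably built with the upstream conversion script (which earned me a PR).

Note that starting December 30, 2019, a License Key is required to download the databases.

The update script: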

#!/bin/bash
# Directory containing this script; the converted .dat files are copied here.
D=$(dirname "$(readlink -f "$0")")
# Temporary working directory.
T=$(mktemp -d)
cd "$T" || exit 1
F=GeoLite2-City
# create license key from https://www.maxmind.com/en/accounts/current/license-key
LICENSE=xxxxx
wget "https://download.maxmind.com/app/geoip_download?edition_id=${F}-CSV&license_key=${LICENSE}&suffix=zip" -O "${F}-CSV.zip"
unzip "${F}-CSV.zip"
rm "${F}-CSV.zip"
cd "${F}"-CSV_*/ || exit 1
# Append the IPv4 blocks to the IPv6 CSV as IPv4-mapped networks (::ffff:a.b.c.d, prefix length + 96).
tail -n+2 "${F}-Blocks-IPv4.csv" |
    awk -F, '{ split($1,a,"/"); split(a[1],a1,"."); m = 96+a[2]; printf("::ffff:%02x%02x:%02x%02x/%d,%s,%s,%s,%s,%s,%s,%s,%s\n",a1[1],a1[2],a1[3],a1[4],m,$2,$3,$4,$5,$6,$7,$8,$9) }' >> "${F}-Blocks-IPv6.csv"
ls -lht
cd "$T" || exit 1
zip "${F}-CSV.zip" "${F}"-CSV_*/*
python "$D/geolite2legacy/geolite2legacy.py" -i "$T/${F}-CSV.zip" -o "$T/GeoLiteCity.dat" -f /home/wow/geolite2legacy/geoname2fips.csv
python "$D/geolite2legacy/geolite2legacy.py" -i "$T/${F}-CSV.zip" -o "$T/GeoLiteCityv6.dat" -f /home/wow/geolite2legacy/geoname2fips.csv -6
for L in GeoLiteCityv6 GeoLiteCity; do
    # Only publish non-empty output files.
    if [[ -s "$T/${L}.dat" ]]; then
        cp "$T/${L}.dat" "$D"
    fi
done
rm -rf "$T"
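The awk step in the script rewrites each IPv4 network as an IPv4-mapped IPv6 network. The sketch below runs the same transformation on a single made-up row (the values 1234 and 5678 are placeholders, and only two data columns are kept for brevity) so the result is easy to inspect.

```shell
#!/bin/sh
# Sample row invented for illustration; only the first column matters here.
sample='1.2.3.0/24,1234,5678'

mapped=$(printf '%s\n' "$sample" |
    awk -F, '{
        split($1, a, "/")        # a[1] = address, a[2] = prefix length
        split(a[1], o, ".")      # o[1] to o[4] = the four octets
        printf("::ffff:%02x%02x:%02x%02x/%d,%s,%s\n",
               o[1], o[2], o[3], o[4], 96 + a[2], $2, $3)
    }')

echo "$mapped"   # prints ::ffff:0102:0300/120,1234,5678
```

The prefix length grows by 96 because the IPv4 address occupies the last 32 bits of the 128-bit IPv6 address.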

A systemd dependency mind blow story

on August 3, 2019

DevOps

The following is taken from a production system that we run. I thought it might be helpful to others who are enjoying the happiness of using systemd.

Building Go projects with private modules in Jenkins

on May 22, 2019

DevOps

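Update:

You can use athens to set up a global Go modules cache, which makes managing SSH keys much easier.

Under the hood, go get calls git to clone modules, so as long as git clone repo_url runs without interaction, go get can download modules as well.

For local use, you can install hub or configure git to rewrite https URLs to ssh addresses, so the private key is used automatically and there is no need to type a username and password.

In Jenkins, keeping an ssh private key on disk is unsafe at any time, even if it is deleted right after use. So we use credential.helper plus environment variables, and give git the username and password over https URLs.

credential.helper lets git call a configured command to obtain the username and password. We use a one-line shell script that prints the environment variables $USERNAME and $PASSWORD: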

git config credential.helper '!f() { sleep 1; echo "username=${USERNAME}\npassword=${PASSWORD}"; }; f'

This way the username and password are never stored on disk; everything stays in memory.
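The helper's output can be checked outside git by running the same kind of function in a shell. The sketch below uses printf instead of echo, on the assumption that the helper may run under a shell whose echo does not expand \n (dash expands it, bash by default does not). The demo-user and demo-pass values are placeholders.

```shell
#!/bin/sh
# Placeholder credentials; in Jenkins these come from withCredentials.
USERNAME='demo-user'
PASSWORD='demo-pass'
export USERNAME PASSWORD

# Same shape as the credential helper: print one key=value pair per line.
f() { printf 'username=%s\npassword=%s\n' "$USERNAME" "$PASSWORD"; }

helper_output=$(f)
echo "$helper_output"
```

The two lines printed are the key=value pairs that git reads from a credential helper.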

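Next, because go get clones fresh repos locally, there is no way to set credential.helper on each repo beforehand, so this has to be a global setting. We run the whole build in a docker container and mount this configuration at $HOME/.gitconfig through a docker volume: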

[credential]
	helper = "!f() { sleep 1; echo \"username=${USERNAME}\npassword=${PASSWORD}\"; }; f"

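Note that Jenkins' docker plugin passes through environment variables such as the current HOME, and that directory often does not exist inside the container, so we override the container user's home directory to /tmp.

The complete Jenkinsfile: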

def GOLANG_VERSION = '1.12' // a string, so that a version such as 1.10 keeps its trailing zero

pipeline {
  agent {
    docker {
      image "golang:${GOLANG_VERSION}"
      args '-v ${WORKSPACE}/.gitconfig:/tmp/.gitconfig -e HOME=/tmp'
    }
  }

  environment {
    GO111MODULE = "on"
    GOCACHE = "/tmp/.gocache"
    GOPATH = "${WORKSPACE}"
    PATH = "${GOPATH}/bin:$PATH"
  }

  stages {
    stage('Checkout') {
      steps {
        withCredentials([[$class: 'UsernamePasswordMultiBinding', credentialsId: 'CREDENTIAL_ID', usernameVariable: 'USERNAME', passwordVariable: 'PASSWORD']]) {
          checkout scm

          sh 'git config credential.helper \'!f() { sleep 1; echo "username=${USERNAME}\npassword=${PASSWORD}"; }; f\''
          sh 'git fetch'
        }
      }
    }
      
    stage('Install dependencies') {
      steps {
        sh 'go version'
        script {
          withCredentials([[$class: 'UsernamePasswordMultiBinding', credentialsId: 'CREDENTIAL_ID', usernameVariable: 'USERNAME', passwordVariable: 'PASSWORD']]) {
            sh "go get -v"
          }
        }
      }
    }

    stage('Run tests') {
      steps {
        script {
          sh "go test"
        }
      }
    }
  
    stage('Build') {
      steps {
        script {
          // item is not defined anywhere in this Jenkinsfile; set it to the output binary name before use
          sh "go build -o ${item}"
        }
      }
    }
  }
}
