Changing pg_num
When increasing pg_num, you may sometimes hit an error like this:
    ceph osd pool set rbd pg_num 4096
    Error E2BIG: specified pg_num 3500 is too large (creating 4096 new PGs on ~64 OSDs exceeds per-OSD max of 32)
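The limit on PG splitting comes from the mon_osd_max_split_count setting: a single pg_num increase may create at most mon_osd_max_split_count new PGs per OSD.
Check the current value on a monitor: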
    # ceph daemon mon.ipmi1151 config get mon_osd_max_split_count
    {
        "mon_osd_max_split_count": "32"
    }
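Calculation script:
    # Per-OSD split limit, read from the monitor's admin socket
    max_inc=$(ceph daemon mon.ipmi1151 config get mon_osd_max_split_count 2>&1 \
      | tr -d '\n ' | sed 's/.*"\([[:digit:]]\+\)".*/\1/')
    # Current pg_num of the pool "wtest"
    pg_num=$(ceph osd pool get wtest pg_num | cut -f2 -d: | tr -d ' ')
    echo "current pg_num value: $pg_num, max increment: $max_inc"
    # Number of OSDs in the cluster
    osd_num=$(ceph osd ls | wc -l)
    # Largest pg_num that can be set in one step (current value plus the allowed increment)
    next_pg_num=$((pg_num + max_inc * osd_num))
    echo "allowed increment of pg_num: $next_pg_num"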
Output:
    current pg_num value: 512, max increment: 32
    allowed increment of pg_num: 800
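Because one step can raise pg_num by at most (max increment) x (number of OSDs), reaching a larger target takes several steps. The following is a minimal sketch of that planning, not a tested operational procedure. The function name plan_pg_steps and the target of 1024 are hypothetical. The OSD count of 9 is inferred from the output above, since (800 - 512) / 32 = 9. The loop only prints the commands (a dry run) and does not call ceph.

```shell
#!/usr/bin/env bash
# Print the successive pg_num values needed to go from the current value to
# the target without any single step exceeding max_inc * osd_num.
# Arguments (hypothetical names): current pg_num, target pg_num,
# mon_osd_max_split_count, number of OSDs.
plan_pg_steps() {
    local current=$1 target=$2 max_inc=$3 osd_num=$4
    local block=$((max_inc * osd_num))
    if [ "$block" -le 0 ]; then
        echo "max_inc * osd_num must be positive" >&2
        return 1
    fi
    while [ "$current" -lt "$target" ]; do
        current=$((current + block))
        # Do not overshoot the requested target on the last step.
        if [ "$current" -gt "$target" ]; then
            current=$target
        fi
        echo "$current"
    done
}

# Dry run: print the commands instead of executing them.
for step in $(plan_pg_steps 512 1024 32 9); do
    echo "ceph osd pool set wtest pg_num $step"
done
```

With these inputs the dry run prints two commands, one setting pg_num to 800 and one setting it to 1024. In practice you would likely wait for the newly created PGs to settle between steps, and raise pgp_num to match afterwards.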
Calculating block_size
block_size = mon_osd_max_split_count * n_osds
This is the largest number of PGs that can be added in one step. The target PG count for a pool is calculated as:
    ( Target PGs per OSD ) x ( OSD # ) x ( %Data ) / ( Size )
If the value of the above calculation is less than the value of ( OSD# ) / ( Size ), then the value is updated to the value of ( OSD# ) / ( Size ). This is to ensure even load / data distribution by allocating at least one Primary or Secondary PG to every OSD for every Pool.
The output value is then rounded to the nearest power of 2.
Tip: The nearest power of 2 provides a marginal improvement in efficiency of the CRUSH algorithm.
If the nearest power of 2 is more than 25% below the original value, the next higher power of 2 is used.
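The calculation and rounding rules above can be sketched in shell as follows. This is an illustration under stated assumptions, not the reference implementation. The function names round_pow2 and suggest_pg_num are hypothetical, %Data is passed as an integer percentage, integer division truncates, and "nearest" is taken to mean nearest by linear distance.

```shell
#!/usr/bin/env bash
# Round a positive integer to a power of 2: take the nearest power of 2,
# but if that is more than 25% below the value, use the next higher one.
round_pow2() {
    local value=$1 lower=1 higher
    if [ "$value" -lt 1 ]; then
        value=1
    fi
    # Largest power of 2 that is <= value.
    while [ $((lower * 2)) -le "$value" ]; do
        lower=$((lower * 2))
    done
    higher=$((lower * 2))
    if [ "$lower" -eq "$value" ]; then
        echo "$lower"
    elif [ $((value - lower)) -gt $((higher - value)) ]; then
        echo "$higher"            # the higher power is the nearest one
    elif [ $((lower * 4)) -lt $((value * 3)) ]; then
        echo "$higher"            # lower is more than 25% below the value
    else
        echo "$lower"
    fi
}

# Arguments: target PGs per OSD, number of OSDs, %Data (integer), Size.
suggest_pg_num() {
    local target_per_osd=$1 osds=$2 pct_data=$3 size=$4
    local raw=$((target_per_osd * osds * pct_data / (100 * size)))
    local floor=$((osds / size))
    # Never go below ( OSD# ) / ( Size ).
    if [ "$raw" -lt "$floor" ]; then
        raw=$floor
    fi
    round_pow2 "$raw"
}

# Example: 100 target PGs per OSD, 9 OSDs, 100% of the data, 3 replicas.
suggest_pg_num 100 9 100 3   # raw value 300, prints 256
```

In the example the raw value is 300. The nearest power of 2 is 256, which is less than 25% below 300, so 256 is kept. A raw value of 350 would instead be bumped to 512, because 256 is more than 25% below it.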