The universal bpipe plugin allows Bacula to receive any data stream written to standard output directly into its backup storage, including files from a Hadoop HDFS cluster, with maximum performance, since data is streamed rather than staged on local disk first.
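The plugin consumes entries of the form bpipe:pseudo-path:backup command:restore command, i.e. a path under which the stream is catalogued, a command whose stdout is read at backup time, and a command whose stdin is fed at restore time. The script below emits one such entry per changed HDFS file; for a hypothetical HDFS file /user/data.csv, the emitted line would look like:

bpipe:/var/user/data.csv:/etc/hadoop/bin/hdfs dfs -cat /user/data.csv:/etc/hadoop/bin/hdfs dfs -put -f /dev/stdin /user/data.csv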
#!/bin/bash
#
# This script feeds hdfs file copies to the Bacula bpipe plugin (FIFO), using
# multiple "hdfs dfs -cat" commands when backing up and multiple "hdfs dfs -put"
# commands to restore.
# Subsequent backups only copy hdfs files changed after the last recorded
# backup time (/etc/last_backup).
#
# Remark: hdfs /tmp and .tmp. folders are excluded by the grep -v.
#
# By Heitor Faria (http://bacula.us | https://bacula.lat);
# Marco Reis;
# Julio Neves (http://livrate.com.br) and
# Rodrigo Hagstrom
#
# Tested with Hadoop 2.7.1; August, 2017.
#
# It must be called from the FileSet INCLUDE sub-resource used by the job that
# backs up a Hadoop node with a Bacula Client, e.g.:
#
# Plugin = "\|/etc/script_hadoop.sh"

hdfs="/etc/hadoop/bin/hdfs"

# Seed the timestamp file on the first run. Note: the test is -e, not -p,
# because /etc/last_backup is a regular file, not a named pipe; testing with
# -p would reset the timestamp on every run and force a full copy each time.
if [[ ! -e /etc/last_backup ]]; then
    echo "00-00-00;00:00" > /etc/last_backup
fi

Date=$(cut -f 1 -d ";" /etc/last_backup)
Hour=$(cut -f 2 -d ";" /etc/last_backup)

# List every hdfs file (the replication field $2 is "-" for directories) whose
# modification date and time ($6 and $7) are at or after the last backup.
# Date and time are compared as a single string so that files modified on a
# later day but at an earlier clock hour than the previous backup are not skipped.
for filename in $($hdfs dfs -ls -R / |
    awk -v ts="$Date $Hour" '$2 != "-" && ($6 " " $7) >= ts {print $8}' |
    grep -v -e /tmp/ -e .tmp.)
do
    # bpipe entry: pseudo-path, backup (reader) command, restore (writer) command.
    echo "bpipe:/var$filename:$hdfs dfs -cat $filename:$hdfs dfs -put -f /dev/stdin $filename"
done

date '+%Y-%m-%d;%H:%M' > /etc/last_backup
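As the script header notes, the script is invoked from a FileSet Include sub-resource in the Bacula Director configuration; the \| prefix makes Bacula run the script at job time and use its output as the list of plugin entries. A minimal sketch of such a FileSet follows (the resource name HadoopHDFS is illustrative, and Signature = MD5 is just a typical option):

FileSet {
  Name = "HadoopHDFS"
  Include {
    Options {
      Signature = MD5
    }
    # Run the generator script; each line it prints is a bpipe plugin entry.
    Plugin = "\|/etc/script_hadoop.sh"
  }
}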