I have the following shell script that fetches the 200 tables with the worst statistics so that I can then run analyze on them:
echo `date "+%Y/%m/%d %H:%M:%S"`" - Starting process $job" >> $log_ejecucion
echo "------------------------------------------------------------------------" >> $log_ejecucion
echo `date "+%Y/%m/%d %H:%M:%S"`" - Fetching the 200 tables with the worst statistics" >> $log_ejecucion
LC_ALL=en_US.UTF-8 LD_LIBRARY_PATH=/usr/lib/ psql -h ${RSHOST} -p 5439 -U ${RSUID} ${RSDB} -t -A -c "select \"schema\",\"table\",stats_off,size from svv_table_info order by stats_off desc limit 200;" > /ruta1/tablas.tmp
echo `date "+%Y/%m/%d %H:%M:%S"`" - Opening the transaction block" >> $log_ejecucion
LC_ALL=en_US.UTF-8 LD_LIBRARY_PATH=/usr/lib/ psql -h ${RSHOST} -p 5439 -U ${RSUID} ${RSDB} -t -A -c "BEGIN;" >> $log_ejecucion
echo `date "+%Y/%m/%d %H:%M:%S"`" - Setting the threshold to 0" >> $log_ejecucion
LC_ALL=en_US.UTF-8 LD_LIBRARY_PATH=/usr/lib/ psql -h ${RSHOST} -p 5439 -U ${RSUID} ${RSDB} -t -A -c "set analyze_threshold_percent to 0;" >> $log_ejecucion
echo `date "+%Y/%m/%d %H:%M:%S"`" - Running ANALYZE on each of them" >> $log_ejecucion
while read tabla
do
SCHEMA=$(echo $tabla | cut -d "|" -f 1)
TABLE=$(echo $tabla | cut -d "|" -f 2)
LC_ALL=en_US.UTF-8 LD_LIBRARY_PATH=/usr/lib/ psql -h ${RSHOST} -p 5439 -U ${RSUID} ${RSDB} -t -A -c "analyze $SCHEMA.$TABLE;" >> $log_ejecucion
done < /ruta1/tablas.tmp
echo `date "+%Y/%m/%d %H:%M:%S"`" - Closing the block" >> $log_ejecucion
LC_ALL=en_US.UTF-8 LD_LIBRARY_PATH=/usr/lib/ psql -h ${RSHOST} -p 5439 -U ${RSUID} ${RSDB} -t -A -c "COMMIT;"
echo `date "+%Y/%m/%d %H:%M:%S"`" - End of process $job - Removing the file tablas.tmp" >> $log_ejecucion
rm -f /ruta1/tablas.tmp
echo "---------------Process finished successfully" >> /ruta1/analyze_200tablas_logfail.log
echo `date "+%Y/%m/%d %H:%M:%S"`" - Process $job finished" >> $log_ejecucion
echo "------------------------------------------------------------------------" >> $log_ejecucion
The problem is that for Redshift to run the analyze correctly, I have to set analyze_threshold_percent to 0. In the previous version of the script, whenever I checked the log I found too many Analyze skip entries, so it occurred to me to put the process inside a begin...commit block.
Even so, it still fails: the log contains many Analyze skip entries, which shouldn't happen if all two hundred analyze commands had run with analyze_threshold_percent set to 0.
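I think the problem could be solved by making a single call to the database. Each psql -c invocation opens its own session, so the BEGIN and the SET issued in one call do not carry over to the analyze commands issued in later calls. I don't know if it will work well since I don't have access to your DB, but a simpler version of your script that sends everything through one psql session may help you.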
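The single-call idea could be sketched roughly as follows. This is only a sketch that I have not been able to run against Redshift: the function name build_analyze_sql is made up for illustration, the input is assumed to be the same schema|table|stats_off|size lines that your first query writes to /ruta1/tablas.tmp, and RSHOST, RSUID, RSDB and log_ejecucion are assumed to be set as in your script. The function only generates SQL text; the actual psql call is left in a comment.

```shell
#!/bin/sh
# Hypothetical helper: emit one SQL script so that the SET and every
# ANALYZE are executed in the same psql session.
build_analyze_sql() {
    # $1: file with one "schema|table|stats_off|size" line per table
    echo "set analyze_threshold_percent to 0;"
    while IFS='|' read -r schema table _rest
    do
        # Skip blank or incomplete lines
        [ -n "$schema" ] && [ -n "$table" ] || continue
        # Quote the identifiers in case a name contains unusual characters
        printf 'analyze "%s"."%s";\n' "$schema" "$table"
    done < "$1"
}

# Assumed usage: pipe the generated SQL into a single psql call, which
# reads its commands from standard input when neither -c nor -f is given.
# build_analyze_sql /ruta1/tablas.tmp |
#     psql -h "$RSHOST" -p 5439 -U "$RSUID" "$RSDB" -t -A >> "$log_ejecucion" 2>&1
```

Because the SET is session-level, the explicit BEGIN and COMMIT should no longer be needed once everything runs over one connection, although that is an assumption you would have to confirm on your cluster.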