I am preparing a view Amazon Redshift
that obtains the queries that have been executed in the last half hour, since there is a script programmed crontab
that inserts this data in a statistics table.
The problem is that I can't get these queries within a limited time. That is to say: the scheduled script is executed every 30 minutes, so I try to get the queries that have been from the current moment (when the script is launched) until half an hour ago.
This is the query I have in that view:
SELECT stl_query.userid,
stl_query.query,
stl_query."label",
stl_query.xid,
stl_query.pid,
stl_query."database",
stl_query.querytxt,
stl_query.starttime,
stl_query.endtime,
stl_query.aborted,
stl_query.insert_pristine
FROM
stl_query, stl_wlm_query
where
stl_query.starttime BETWEEN (getdate())
and (dateadd('microsec', -1800000000, getdate()))
This query does not return any value.
The other version I had is this:
SELECT stl_query.userid,
stl_query.query,
stl_query."label",
stl_query.xid,
stl_query.pid,
stl_query."database",
stl_query.querytxt,
stl_query.starttime,
stl_query.endtime,
stl_query.aborted,
stl_query.insert_pristine
FROM
stl_query, stl_wlm_query
WHERE
stl_query.query = stl_wlm_query.query
AND stl_query.starttime <= dateadd('microsec', -1800000000, getdate())
But the values it throws me are older than the time period I need.
EDIT: I just re-launched a new, more simplified query, to see where the problem is. I have tried to cast the condition to date, but curiously, put <
or >
always returns previous values:
select stl_query.userid,stl_query.starttime from stl_query
where starttime > cast(dateadd(min, -30, getdate()) as date)
order by stl_query.starttime desc
That is, I have launched the query at 12:40 and these are the last 5 results (you have to take into account that the server time is UTC and my time zone is UTC+2 (Spain), hence the delay in launching it at 12:40 and leave at 10:40:
2019-05-20 10:40:00
2019-05-20 10:39:56
2019-05-20 10:39:53
2019-05-20 10:39:51
2019-05-20 10:39:51
Finally, I have found the solution to the problem interpreting the values that it threw me. The bug in the where(
stl_query.starttime <= dateadd('microsec', -1800000000, getdate())
) condition that I posted in the question is that itgetdate()
gets the local system time (in my case,UTC+2
).Because the Redshift cluster is in
UTC
, the time indicated as a condition was never present in the data. I have modified the query as follows:In this way, what I do is:
getdate()
UTC+2
indicating that its time zone isEurope/Madrid
and that I want to convert it to UTC with the functionconvert_timezone
dateadd
the necessary 30 minutes in microseconds (redshift field time unit)stl_query.starttime
.Now I get the correct data.