Hello friends, today I ran into a problem that has me a little puzzled. I have created a simple way to capture the referer and count how many times that URL has been visited.
This is the function:
function datosreferer(){
    if(isset($_SERVER['HTTP_REFERER']) && $_SERVER['HTTP_REFERER'] !== null){
        $re = $_SERVER['HTTP_REFERER'] ;
        $Z = md5($re);
        $A = substr($Z,0,2);
        $B = substr($Z,16,1);
        $C = substr($Z,30,1);
        $D = substr($Z,23,1);
        $shortcut = $A.$B.$C.$D;
        $contents= false;
        $refer = "./Data/$shortcut";
        if(file_exists($refer) == true){
            $row = file($refer,FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
            $referer = $row['0'];
            $count = $row['1'];
            $count++;
            $contents .= $referer."\r\n";
            $contents .= $count;
            file_put_contents($refer,$contents);
        }else{
            $count = 0;
            $contents .= $re."\r\n";
            $contents .= $count;
            file_put_contents($refer,$contents);
        }
        return true;
    }else{
        return false;
    }
}
As you can see, the approach is very basic: it creates a file containing the referer URL, if one exists, and a number that increases each time the same URL appears. So far so good.
This is how I call the function from the page where the visitors arrive:
if (isset($_GET['id'])) {
    @$id = isset($_GET['id']) ? $_GET['id'] : '';
    datosreferer();
}else{
    header('Location:'.$conf['redirect'].'');
}
If the id exists in the URL, the function is called.
The problem is that on each visit the counter increases by more than one, even though the visit is not repeated. I hope you can give me some idea of where the problem is.
What I see is that you are throwing away most of the entropy of the MD5 hash.
Reducing it to 5 characters leaves only 16^5 possibilities (just over a million). Add to that the fact that you always take positions 0, 1, 16, 30 and 23 in the same order, and there is a very high probability that some other referer collides on those 5 characters, mysteriously increasing the counter.
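To put a rough number on the size of that key space, here is a minimal sketch using the usual birthday-bound approximation. The helper name collisionChance is hypothetical, and the estimate assumes the 5-character keys are spread uniformly, which is an assumption and not something measured on real referers.

```php
<?php
// Number of distinct 5-character hexadecimal keys.
$keys = 16 ** 5; // 1048576

// Hypothetical helper: birthday-bound estimate of the chance that at least
// two of $n distinct referers share a key, assuming keys are uniformly spread.
function collisionChance(int $n, int $keys): float
{
    return 1.0 - exp(-$n * ($n - 1) / (2 * $keys));
}

// Print the estimate for a few referer counts.
foreach ([100, 1000, 5000] as $n) {
    printf("%5d referers: %.1f%%\n", $n, 100 * collisionChance($n, $keys));
}
```

The point is only that the chance of some pair colliding grows with the square of the number of referers, so a key space of about a million fills up much sooner than it seems.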
To demonstrate this, the following code generates random strings of a reasonable length for a domain name, leaving out the commonly repeated parts such as www. and .com. The exercise consistently gives me a collision rate between 1% and 5%; in other words, at least 1 out of every 100 referers will increment the counter of another referer.
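A test of this kind might look like the minimal sketch below. It is an assumption of how such an experiment could be written, not the exact listing used for the figures above: shortcut repeats the key construction from the question, while randomName, countCollisions, the string lengths (6 to 15 lowercase letters) and the sample size are all hypothetical choices.

```php
<?php
// Same 5-character key as datosreferer(): MD5 positions 0-1, 16, 30 and 23.
function shortcut(string $url): string
{
    $z = md5($url);
    return substr($z, 0, 2) . $z[16] . $z[30] . $z[23];
}

// Hypothetical helper: random lowercase string of $len characters.
function randomName(int $len): string
{
    $s = '';
    for ($i = 0; $i < $len; $i++) {
        $s .= chr(mt_rand(97, 122)); // 'a'..'z'
    }
    return $s;
}

// Count how many of $n random names land on a key already taken by a different name.
function countCollisions(int $n): int
{
    $seen = [];       // key => name that first produced it
    $collisions = 0;
    for ($i = 0; $i < $n; $i++) {
        $name = randomName(mt_rand(6, 15));
        $key = shortcut($name);
        if (isset($seen[$key]) && $seen[$key] !== $name) {
            $collisions++; // a different referer would bump this counter
        } else {
            $seen[$key] = $name;
        }
    }
    return $collisions;
}

$n = 20000; // assumed sample size; the rate depends heavily on it
$hits = countCollisions($n);
printf("%d of %d random names collided (%.2f%%)\n", $hits, $n, 100 * $hits / $n);
```

The measured rate changes with the sample size, so the percentage printed here will not necessarily match the range quoted above.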
If instead of random strings we used dictionary words, as we normally find in URLs, the probability would increase.
Because of the above, and regardless of whether that is the cause of your problem, you should use the resulting hash without further processing.
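As a minimal sketch of that suggestion, the file name can be built from the complete hash. The helper name refererPath and its $dir parameter are hypothetical; only the ./Data directory comes from the question.

```php
<?php
// Hypothetical helper: build the counter file path from the complete MD5 hash.
function refererPath(string $referer, string $dir = './Data'): string
{
    // All 32 hex characters are kept, so two different referers
    // map to the same file only if their full MD5 hashes collide.
    return $dir . '/' . md5($referer);
}

echo refererPath('https://example.com/page'), PHP_EOL;
```

Inside datosreferer() this would replace the $A, $B, $C, $D and $shortcut lines, with $refer = refererPath($re).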
I added PHP_EOL and FILE_APPEND | LOCK_EX to your file_put_contents call so that each write is appended to the same file. To find the position that holds the hit count, I take the total size of the array and subtract 1. sizeof on its own returns the number of elements, which does not account for index 0, where the array starts, so the last element sits at sizeof minus 1.
That is how I position myself where the counter value is found.
Here it is shown more clearly what I mean about the arrangement:
Here is the complete code.
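A minimal sketch of the append approach described above, under some assumptions: the helper name appendHit is hypothetical, the file layout (referer on the first line, one running count per later line) is inferred from the description, and a temporary file stands in for ./Data so the example can run anywhere.

```php
<?php
// Hypothetical helper: append the next hit count to $file and return it.
// Assumed layout: line 0 holds the referer, each later line a running count.
function appendHit(string $file, string $referer): int
{
    if (!file_exists($file)) {
        // First visit: store the referer, then a count of 0 as in the question.
        file_put_contents($file, $referer . PHP_EOL . '0' . PHP_EOL, LOCK_EX);
        return 0;
    }
    $row = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    $last = sizeof($row) - 1;       // indexes start at 0, so the last line is size - 1
    $count = (int) $row[$last] + 1;
    file_put_contents($file, $count . PHP_EOL, FILE_APPEND | LOCK_EX);
    return $count;
}

$file = tempnam(sys_get_temp_dir(), 'ref'); // temporary file for the demo
unlink($file);                               // start from "file does not exist"
appendHit($file, 'https://example.com/');
appendHit($file, 'https://example.com/');
$third = appendHit($file, 'https://example.com/');
$lines = file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($file);                               // clean up the demo file
```

Note that LOCK_EX only protects the write itself; the read with file() and the later append are separate steps, so two simultaneous requests could still read the same last value.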