I have been looking for a way to straighten a scanned image, more precisely a scanned form. I need a function that straightens it automatically. I have tried the following:
import numpy as np
import cv2

# Load the scan in color; it is converted to grayscale on the next line
image = cv2.imread('D:/Consecu.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Invert so that the ink becomes white (non-zero) on a black background
gray = cv2.bitwise_not(gray)
thresh = cv2.threshold(gray, 0, 255,
                       cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
# Coordinates of every ink pixel, and the angle of the rectangle enclosing them
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]
if angle < -45:
    angle = -(90 + angle)
else:
    angle = -angle
# Rotate the image around its center by the detected angle
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, M, (w, h),
                         flags=cv2.INTER_CUBIC,
                         borderMode=cv2.BORDER_REPLICATE)
cv2.putText(rotated, "Angle: {:.2f} degrees".format(angle),
            (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
print("[INFO] angle: {:.3f}".format(angle))
cv2.imshow("Input", image)
cv2.imshow("Rotated", rotated)
cv2.waitKey(0)
But the image does not rotate: the angle of inclination it finds is 0.0. How could I rotate the page, or detect its lines and make them horizontal?
I need a function that straightens the image automatically and leaves it that way.
The problem is that the algorithm you are applying, taken from https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/, is designed specifically for text like the one shown on that same page:
In the images used in that article, it turns out that all the "ink" pixels are inside a rectangle, and that rectangle is precisely what the code is looking for:
Basically, what that code does is find the smallest rectangle that contains all the ink pixels, so that only non-ink pixels are left outside. The rectangle it finds is the one I have marked here in red:
Once the rectangle is found it is easy to find its angle and use it to straighten the text.
Unfortunately this trick doesn't work for you, because the "ink" pixels are scattered all over the page, and in particular there are a lot of black pixels near the corners of the image. As a consequence, the minimum rectangle that encloses the ink in your case is the entire page. That is why the angle comes out as zero.
On the other hand, your case has a very interesting feature, which is that since it is a printed form, it contains a large number of horizontal lines (the grid of the form).
Using the Hough transform we can find all those lines. For each line, this transform gives you the angle it forms with the horizontal. In fact, it would find many lines in your image, most of them horizontal, but also some vertical ones. For example, this is what you might get if you limit yourself to lines with more than 1000 pixels:
(The lines have not been drawn completely, only a piece so as not to completely cover the original ones).
We can see that in some areas (the wide black bars) it detects a tangle of lines with varying angles. We could have improved that by running edge detection (e.g. Canny) on the image before passing it in. But it does not matter much in this case, because it found the rest of the thin lines perfectly.
What we can do is go through all the lines it found and keep the angles that appear most frequently, which will correspond to the horizontal lines of the grid. The Counter class from Python's standard collections module comes in very handy for this task.
The following code does everything described above (I chose to write the result to another file instead of displaying it on screen, since I am running it on a server without a graphical terminal):
This is what comes out in the file "corregida.jpg" (I have halved its resolution to paste it here, since Stack Overflow complained about the size):
Update
Since the user reported some cases of images that were not rotated correctly, I reviewed the algorithm. I have made the following improvements:
With these improvements I have applied it to a few images, rotated in different directions or not rotated, and in all of them it comes out right.
This is the new code (I've refactored it to a function):