
Question

NSantos

Asked: 2020-02-28 06:35:35 +0800 CST

How do I straighten an image in Python?


I have been looking for a way to straighten (deskew) a scanned image, more precisely a scanned form. I need a function that straightens it automatically. I have tried the following:

import numpy as np
import argparse
import cv2

image = cv2.imread('D:/Consecu.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

thresh = cv2.threshold(gray, 0, 255,
                       cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]

if angle < -45:
    angle = -(90 + angle)
else:
    angle = -angle

(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)

rotated = cv2.warpAffine(image, M, (w, h),
                         flags=cv2.INTER_CUBIC,
                         borderMode=cv2.BORDER_REPLICATE)
cv2.putText(rotated, "Angle: {:.2f} degrees".format(angle),
            (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)

print("[INFO] angle: {:.3f}".format(angle))
cv2.imshow("Input", image)
cv2.imshow("Rotated", rotated)
cv2.waitKey(0)

But the image does not rotate: the detected angle of inclination is 0. How can I rotate the page, or detect its lines and make them horizontal?

I need a function that straightens the image automatically and keeps it that way.

1 Answer


abulafia · Answer 1 · 2020-02-28T10:05:06+08:00

The problem is that the algorithm you are applying, taken from https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/, is specifically designed for text like the one shown on that same page.

In the images used in that article, it turns out that all the "ink" pixels are inside a rectangle, and that rectangle is precisely what the code is looking for:

coords = np.column_stack(np.where(thresh > 0))
angle = cv2.minAreaRect(coords)[-1]

Basically, that code finds the minimum-area rectangle that encloses all the ink pixels, so that only non-ink pixels are left outside it. For the text in that article, the rectangle it finds is the tilted box that tightly wraps the block of text.

Once the rectangle is found it is easy to find its angle and use it to straighten the text.

Unfortunately this trick doesn't work for you, because the "ink" pixels are scattered all over the page, and in particular there are a lot of black pixels near the corners of the image. As a consequence, the minimum rectangle that encloses the ink in your case is a rectangle equal to the entire page. That is why the angle comes out as zero.

On the other hand, your case has a very interesting feature, which is that since it is a printed form, it contains a large number of horizontal lines (the grid of the form).

Using the Hough transform we can find all those lines. For each line, this transform gives you the angle it forms with the horizontal. In fact, it finds many lines in your image, most of them horizontal, but also some vertical ones. For example, this is what it finds if you limit yourself to lines of more than 1000 pixels:

(The lines have not been drawn completely, only a piece so as not to completely cover the original ones).
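
As a minimal sketch of how to read those angles: in the (rho, theta) pair that cv2.HoughLines returns for each line, theta is the angle of the line's normal in radians, so a perfectly horizontal line has theta = π/2 and a vertical one has theta = 0. The helper name skew_degrees below is hypothetical (it is not part of OpenCV); it only converts theta into the deviation from horizontal, in degrees.

```python
import math


def skew_degrees(theta):
    """Deviation from horizontal, in degrees, of a Hough line
    whose normal forms the angle theta (radians)."""
    return math.degrees(theta - math.pi / 2)


# A horizontal line: its normal points straight down (theta = pi/2)
print(skew_degrees(math.pi / 2))

# A line tilted by two degrees with respect to the horizontal
print(skew_degrees(math.pi / 2 + math.radians(2)))
```

This is the same subtraction of π/2 that the updated code further down applies before rotating the image.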

We can see that in some areas (the wide black bars) it detects a tangle of lines with varying angles. This could have been improved by running an edge detector (e.g. Canny) on the image before passing it to the transform. It does not matter much in this case, though, because the remaining thin lines were found perfectly.

What we can do is go through all the lines it found and keep the angles that appear most often, which will correspond to the horizontal lines of the grid. The Counter class from Python's standard collections module is very handy for this.
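
A minimal sketch of that counting step on made-up data (the theta values below are invented for illustration, they are not taken from the form). Counting exact float values is reasonable here on the assumption that the Hough transform only reports angles that are multiples of the angular resolution, so equal angles compare as equal.

```python
from collections import Counter

# Hypothetical theta values (radians), as if collected from Hough lines:
# mostly near-horizontal grid lines plus a couple of vertical ones.
thetas = [1.5664, 1.5664, 1.5664, 1.5708, 0.0, 1.5664, 0.0]

# Count how many lines share each angle and keep the most frequent one
counts = Counter(thetas)
most_frequent_theta, occurrences = counts.most_common(1)[0]

print(most_frequent_theta, occurrences)  # prints: 1.5664 4
```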

The following code does everything described above (I chose to write the result to another file instead of displaying it on screen, because I am running it on a server without a graphical display):

import numpy as np
import cv2

# Read the image
imagen = cv2.imread('image.jpg')
# Convert it to grayscale and invert it (negative)
gray = cv2.cvtColor(imagen, cv2.COLOR_BGR2GRAY)
gray = cv2.bitwise_not(gray)

# Threshold it to make it binary
# (pixels will be 0 or 255)
binaria = cv2.threshold(gray, 0, 255,
                cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]


# Use the Hough transform to find lines in the
# binarized image, with an angular resolution of a
# quarter of a degree (pi/720) and keeping only the
# lines that get 1000 votes or more (which will be
# the longest ones)
lineas = cv2.HoughLines(binaria, 1, np.pi/720, 1000)

# Collect the angle the Hough transform found
# for each of the detected lines
angulos = []
for linea in lineas:
    theta = linea[0][1]
    angulos.append(theta)

# Now count how many times each angle appears
from collections import Counter
veces = Counter(angulos)

# And keep the angle that is repeated most often
angulo = veces.most_common()[0][0]

# Reverse the direction of rotation if the angle is greater than 90º (pi/2)
if angulo > np.pi/2:
   angulo = -angulo
print("[INFO] angulo: {:.5f}".format(angulo))

# Now straighten the image, rotating it by the negative of the detected angle
# NOTE: angulo is in radians here, but getRotationMatrix2D expects degrees
# (this is corrected in the update below)
(h, w) = imagen.shape[:2]
centro = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(centro, -angulo, 1.0)
girada = cv2.warpAffine(imagen, M, (w, h),
            flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

# And write the result to disk
cv2.imwrite("corregida.jpg", girada)

This is what ends up in the file "corregida.jpg" (I halved its resolution to paste it here, because Stack Overflow complained about the size):

Update

Since the user reported some images that were not rotated correctly, I reviewed the algorithm and made the following improvements:

Instead of binarizing the image with a threshold, I applied a Canny filter. This also turns the image into white lines on a black background, but the lines are the edges (transitions between white and black) of the original image. This prevents Hough from finding many overlapping lines in areas with wide bands of ink.
Instead of keeping only the most frequent angle, I keep the three most frequent ones and compute their average, weighted by how often each one appears.
I transform the angle before applying it to the image: I subtract π/2 from it and, most importantly, convert it to degrees before computing the transformation matrix. This is crucial. Passing radians, as the previous version did, was a bug, and it is a miracle that it happened to rotate the image by the right amount anyway.

With these improvements I have applied it to a few images, rotated in different directions or not rotated, and in all of them it comes out right.
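
The second and third improvements can be sketched in isolation, without OpenCV. This is only an illustration under stated assumptions: the function name dominant_skew_degrees and the sample angles are hypothetical, and the input is assumed to be a plain list of Hough theta values in radians that are already close to horizontal.

```python
import math
from collections import Counter


def dominant_skew_degrees(thetas, top=3):
    """Weighted average of the `top` most frequent theta values (radians),
    returned as a deviation from horizontal in degrees."""
    frequent = Counter(thetas).most_common(top)
    total = sum(count for _, count in frequent)
    if total == 0:
        raise ValueError("no angles to average")
    mean_theta = sum(theta * count for theta, count in frequent) / total
    # Subtract pi/2 (theta of a horizontal line) and convert to degrees,
    # the unit that cv2.getRotationMatrix2D expects for its angle argument
    return math.degrees(mean_theta - math.pi / 2)


# Hypothetical data: three lines tilted 1 degree and one tilted 2 degrees
one = math.pi / 2 + math.radians(1)
two = math.pi / 2 + math.radians(2)
print(dominant_skew_degrees([one, one, one, two]))
```

With three lines at 1 degree and one at 2 degrees, the weighted average is about 1.25 degrees, which is the value that would then be passed to the rotation.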

This is the new code (I've refactored it to a function):

from collections import Counter

import numpy as np
import cv2


def estan_cercanos(a1, a2, error):
    # True if angle a1 lies within +-error of angle a2 (all in radians);
    # np.unwrap removes 2*pi jumps so the comparison works across wrap-around
    cases = np.unwrap([a2-error, a1, a2 + error])
    return cases[0] <= cases[1] <= cases[2]

def enderezar(entrada, salida):
    # Read the image
    imagen = cv2.imread(entrada)

    # Convert it to grayscale and detect edges
    gray = cv2.cvtColor(imagen, cv2.COLOR_BGR2GRAY)
    binaria = cv2.Canny(gray,50,150,apertureSize = 3)

    # Use the Hough transform to find lines in the
    # edge image, with an angular resolution of a
    # quarter of a degree (pi/720) and keeping only the
    # lines that get 1000 votes or more (which will be
    # the longest ones)
    lineas = cv2.HoughLines(binaria, 1, np.pi/720, 1000)

    # Collect the angle the Hough transform found
    # for each of the detected lines
    angulos = []
    for linea in lineas:
        rho, theta = linea[0]
        if rho<0:
            theta = -theta

        # Keep only the lines that are close to horizontal
        # (with a tolerance of +-10 degrees)
        if not estan_cercanos(theta, np.pi/2, np.deg2rad(10)):
            continue

        angulos.append(theta)

    # Now count how many times each angle appears
    veces = Counter(angulos)

    # Keep the three most frequent cases
    frecuentes = veces.most_common(3)

    # And compute the weighted average of those three cases
    suma = sum(angulo*repeticion for angulo,repeticion in frecuentes)
    repeticiones = sum(repeticion for angulo, repeticion in frecuentes)
    angulo = suma/repeticiones

    # Deviation from horizontal, converted from radians to degrees
    angulo = np.rad2deg(angulo - np.pi/2)
    print("[INFO] angulo: {:.5f}".format(angulo))

    # Now straighten the image, rotating it by the detected angle
    (h, w) = imagen.shape[:2]
    centro = (w // 2, h // 2)
    M = cv2.getRotationMatrix2D(centro, angulo, 1.0)

    girada = cv2.warpAffine(imagen, M, (w, h),
                flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

    # And write the result to disk
    cv2.imwrite(salida, girada)
