As a Python practice exercise, I wrote a program that searches a range of integers for those that are the sum of two (or more) squares. It works fine, but I have the following doubts.

I start with a generator and an empty list. Each number is compared against the contents of the list and added to it if it meets certain conditions. Am I right that this is more efficient than building a full list up front (instead of using the generator)? Or does it depend on whether, in the first case, I prefer to have more memory available, or, in the second, less data processing? (I'm not sure I'm explaining this well.)

How could the code be improved? I'd like to see whether someone more advanced (I'm a hobbyist and self-taught) can come up with a noticeably simpler solution. I've probably done something silly.

The program is as follows:
```python
def gen_raices(min, max):
    lista = []
    # Decides among three options:
    # op1 = The integer is not in lista.
    # op2 = It is in lista, but with a different sum.
    # op3 = It is in lista, with the same sum.
    def comprueba(r1, r2):
        for d in range(len(lista)):
            if lista[d][0] == r1 + r2:
                for t in range(1, len(lista[d])):
                    if r1 in lista[d][t]:
                        return 3
                return (2, d)
        return 1
    # Adds the sums of squares to the list.
    e1 = 1
    while e1 ** 2 <= max:
        for e2 in range(1, max):
            r1 = e1 ** 2
            r2 = e2 ** 2
            if r1 + r2 <= max and r1 + r2 >= min:
                comp = comprueba(r1, r2)
                if comp == 1:
                    lista.append([r1 + r2, (r1, r2)])
                elif comp == 3:
                    continue
                else:
                    lista[comp[1]].append((r1, r2))
        e1 += 1
    lista.sort()
    return lista

print(gen_raices(1, 100))
```
The result is:

```
[[2, (1, 1)], [5, (1, 4)], [8, (4, 4)], [10, (1, 9)], [13, (4, 9)], [17, (1, 16)], [18, (9, 9)], [20, (4, 16)], [25, (9, 16)], [26, (1, 25)], [29, (4, 25)], [32, (16, 16)], [34, (9, 25)], [37, (1, 36)], [40, (4, 36)], [41, (16, 25)], [45, (9, 36)], [50, (1, 49), (25, 25)], [52, (16, 36)], [53, (4, 49)], [58, (9, 49)], [61, (25, 36)], [65, (1, 64), (16, 49)], [68, (4, 64)], [72, (36, 36)], [73, (9, 64)], [74, (25, 49)], [80, (16, 64)], [82, (1, 81)], [85, (4, 81), (36, 49)], [89, (25, 64)], [90, (9, 81)], [97, (16, 81)], [98, (49, 49)], [100, (36, 64)]]
```
## Preliminary note

First, note that in your question you mention the term "generator", but your code doesn't actually use generators. That word has a very specific meaning in Python: a generator is a function that contains a `yield` statement, which acts somewhat like a `return`, but without terminating the function, so the function can be resumed later and continue on the line after the `yield`.

A generator could be used in your case so that, each time you call it, it "generates" the next sum of squares (instead of storing them all in a list and returning the list at the end), but that's not what you're looking to do.
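Just to illustrate the idea (a hypothetical sketch, not part of your program), a generator yielding sums of two squares one at a time could look like this:

```python
def sumas_de_cuadrados(max):
    # Yields each sum of two distinct-or-equal squares <= max, one at a time;
    # execution pauses at the yield and resumes there on the next request.
    e1 = 1
    while e1 ** 2 <= max:
        for e2 in range(e1, int((max - e1 ** 2) ** 0.5) + 1):
            yield e1 ** 2 + e2 ** 2
        e1 += 1

for s in sumas_de_cuadrados(10):
    print(s)  # prints 2, 5, 10, 8 (in generation order, not sorted)
```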
## Improvements in your code

The code is correct and, as you say, it works, but there are many possible improvements.

### Most appropriate data types

The types you use to solve the problem are basically lists. Python has others that are better suited to speeding up searches.
For example, your list stores, for each number, its decomposition as a sum of squares. An element of the list might hold the values `[65, (1, 64), (16, 49)]`, which indicates that `65` decomposes into the sum `1+64` and also `16+49`.

A more efficient approach would be a dictionary instead of a list, whose keys are the numbers and whose values are the lists of possible decompositions into squares. Your dictionary would then contain (among others) the entry:

```python
{ 65: [(1, 64), (16, 49)] }
```

Checking whether an element is in the dictionary is much more efficient than iterating through a list for the same purpose: just write, for example, `if 65 in diccionario`. This makes your `comprobar()` function unnecessary.
We want to prevent repeated values from ending up under the same dictionary key (that is, if we already know that 17 is `1+16`, we don't also want to store `16+1`). That's why I check that `(r2, r1)` is not already in the list associated with `diccionario[r1+r2]`.

Another possibility is to use sets instead of lists. A set is a container that, even if you put duplicate items into it, stores only one copy of each. We could therefore make the dictionary's values sets and put sorted tuples into them (so `(1,16)` and `(16,1)`, once sorted, both come out as `(1,16)`, and the set will store it only once).

The code would look like this (I'm not sure this is more readable; I almost prefer the previous version):
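A sketch of this set-based version (one possible way to write it, keeping the original names `gen_raices` and `diccionario`):

```python
def gen_raices(min, max):
    diccionario = {}
    e1 = 1
    while e1 ** 2 <= max:
        for e2 in range(1, max):
            r1 = e1 ** 2
            r2 = e2 ** 2
            if min <= r1 + r2 <= max:
                if r1 + r2 not in diccionario:
                    diccionario[r1 + r2] = set()
                # Sorting the pair makes (1, 16) and (16, 1) identical,
                # so the set keeps only one copy
                diccionario[r1 + r2].add(tuple(sorted((r1, r2))))
        e1 += 1
    return diccionario
```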
Finally, note that we have to check whether `r1+r2` was already in the dictionary, since if it wasn't, we have to create a new entry for it (an empty list or set), whereas if it was, we just add a new pair to the existing list (or set).

This is simplified by using a `defaultdict`. This is a dictionary that, when you access a key it doesn't have, automatically creates it, with a default value of the type you choose. For example, continuing with the version that uses sets:
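A sketch of that variant (same structure as before, with `defaultdict` from the standard library's `collections` module):

```python
from collections import defaultdict

def gen_raices(min, max):
    # Missing keys are created automatically as empty sets
    diccionario = defaultdict(set)
    e1 = 1
    while e1 ** 2 <= max:
        for e2 in range(1, max):
            r1 = e1 ** 2
            r2 = e2 ** 2
            if min <= r1 + r2 <= max:
                diccionario[r1 + r2].add(tuple(sorted((r1, r2))))
        e1 += 1
    return diccionario
```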
Note that the `if r1+r2 not in diccionario` check is no longer needed. We access the element `[r1+r2]` directly, as if it existed, in order to update it. If it doesn't exist, an empty set is created automatically and the tuple in question is then added to it.

### Reduce unnecessary iterations
In your code you vary `r1` from 1 to `max`, and the same for `r2`. This makes you compute many unnecessary cases: all those for which `r1**2+r2**2` exceeds `max` are discarded, but with a bit of thought about what the loop bounds should be, you could have avoided computing them at all.

Note that it is enough to iterate `r1` from 1 to the square root of `max`, since for every `r1` greater than that square root, `r1**2` is already greater than `max`.

On the other hand, it is enough to start `r2` at `r1` instead of at 1, to avoid generating duplicates like `(1,16)` and `(16,1)`. This saves us from having to use sets, or from checking whether the pair in the other order was already stored. It's a great optimization! We can also stop the iteration on reaching the square root of `max - r1`, since for any value of `r2` greater than that, the sum `r1**2+r2**2` would exceed `max`.

The following code implements these ideas (to compute the square root, I raised to the power 0.5):
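A sketch of the tightened loops (back to lists for the values, since duplicates can no longer arise; the exact integer bounds via `int(... ** 0.5)` are my reading of the limits described above):

```python
from collections import defaultdict

def gen_raices(min, max):
    diccionario = defaultdict(list)
    # e1 only needs to run up to sqrt(max)
    for e1 in range(1, int(max ** 0.5) + 1):
        # e2 starts at e1 (no duplicates) and stops at sqrt(max - e1**2)
        for e2 in range(e1, int((max - e1 ** 2) ** 0.5) + 1):
            r1 = e1 ** 2
            r2 = e2 ** 2
            if r1 + r2 >= min:
                diccionario[r1 + r2].append((r1, r2))
    return diccionario
```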
### Final optimization

A little extra optimization comes from noticing that the instruction `r1 = e1**2` is executed many times because it sits inside the second loop, yet `r1` does not vary within that loop, so it always yields the same value and we are recomputing it unnecessarily. It can be computed before entering that loop:
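A sketch with that computation hoisted out of the inner loop:

```python
from collections import defaultdict

def gen_raices(min, max):
    diccionario = defaultdict(list)
    for e1 in range(1, int(max ** 0.5) + 1):
        r1 = e1 ** 2          # computed once per e1, not once per e2
        for e2 in range(e1, int((max - r1) ** 0.5) + 1):
            r2 = e2 ** 2
            if r1 + r2 >= min:
                diccionario[r1 + r2].append((r1, r2))
    return diccionario
```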
## Result

In any of the above cases, the result returned by the function is a dictionary (a `defaultdict`, in fact). If you simply print it, it won't necessarily come out sorted from lowest to highest by its keys. If you want it sorted, you can convert the dictionary into a list of tuples (the first element of each tuple being the key, the second its value) and sort that list. For example:
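A sketch of that conversion and sort (reusing the optimized function from above):

```python
from collections import defaultdict

def gen_raices(min, max):
    diccionario = defaultdict(list)
    for e1 in range(1, int(max ** 0.5) + 1):
        r1 = e1 ** 2
        for e2 in range(e1, int((max - r1) ** 0.5) + 1):
            r2 = e2 ** 2
            if r1 + r2 >= min:
                diccionario[r1 + r2].append((r1, r2))
    return diccionario

# items() gives (key, value) tuples; sorted() orders them by key
ordenado = sorted(gen_raices(1, 100).items())
print(ordenado[:3])  # → [(2, [(1, 1)]), (5, [(1, 4)]), (8, [(4, 4)])]
```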
## Execution times

Out of curiosity, I timed how long each of these versions takes to run, using `timeit` (which runs the function 1000 times and keeps the best of three repeats, to filter out random "noise").
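The measurement can be reproduced with something like the following (a sketch; the original timing figures themselves will of course vary from machine to machine):

```python
import timeit
from collections import defaultdict

def gen_raices(min, max):
    diccionario = defaultdict(list)
    for e1 in range(1, int(max ** 0.5) + 1):
        r1 = e1 ** 2
        for e2 in range(e1, int((max - r1) ** 0.5) + 1):
            r2 = e2 ** 2
            if r1 + r2 >= min:
                diccionario[r1 + r2].append((r1, r2))
    return diccionario

# Three repeats of 1000 calls each; keep the best repeat to reduce noise
tiempos = timeit.repeat(lambda: gen_raices(1, 100), repeat=3, number=1000)
print(min(tiempos) / 1000, "seconds per call")
```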
## Conclusion

Using more suitable types such as the dictionary reduces the complexity of the code, which becomes easier to read, but it doesn't reduce the execution time all that much, only about 100 µs (and in the case of using `set()` instead of lists, things actually get slightly worse).

On the other hand, stopping to think about how to eliminate unnecessary iterations improves the results dramatically, reducing the execution time by an order of magnitude (dividing it by 10). In this example we're talking about microseconds, so it may not seem worth it, but if the problem were bigger and your version took 20 minutes to finish, the optimized version would take 2 minutes. It pays off!

In any case, as the great sage Donald Knuth said, *premature optimization is the root of all evil*. In other words, start by writing code that is readable and easy to understand, and only if you really need it to go much faster, think about how to change it to improve its speed. Prioritize readability over speed. After all, that's why we use Python instead of C! ;-)