I have an array x and two arrays holding the start and end indices (these could equally be start and length) that identify the sequences of integers in x that I need to process. The sequences are of variable length.
import numpy
x=numpy.array([2,3,5,7,9,12,15,21,27,101, 250]) # can have a length in the millions
starts=numpy.array([2,7]) # can have a length in the thousands
ends=numpy.array([5,9])
# required output is x[2:5],x[7:9] in flat 1D array
# [5,7,9,21,27]
I can easily do this with for loops, but the application is performance-sensitive, so I'm looking for a way to do it without a Python-level loop.
Any help will be gratefully received!
Proposal #1
A vectorized approach would be to build a mask with broadcasting.
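A minimal sketch of that broadcasting mask, assuming half-open ranges `[start, end)` as in the question. The function name `select_by_broadcast_mask` is made up for illustration.

```python
import numpy as np

def select_by_broadcast_mask(x, starts, ends):
    """Return the elements of x covered by any half-open range [start, end)."""
    r = np.arange(x.size)
    # Compare every position with every range: shape (len(starts), x.size).
    in_range = (r >= starts[:, None]) & (r < ends[:, None])
    # Keep a position if it falls inside at least one range.
    mask = in_range.any(axis=0)
    return x[mask]

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 7])
ends = np.array([5, 9])
print(select_by_broadcast_mask(x, starts, ends).tolist())  # [5, 7, 9, 21, 27]
```

Note that the intermediate boolean array has `len(starts) * x.size` elements, so with millions of values and thousands of ranges it can become very large.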
Proposal #2
Another vectorized way would be to create ramps of 1s and 0s with cumsum, which should work better when there are lots of start-end pairs.
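One way the cumsum idea could look, as a sketch: put +1 at each start and -1 at each end, then take a running sum so that positions inside a range come out positive. The function name `select_by_cumsum_mask` and the extra trailing slot (so that an end equal to `x.size` needs no special case) are choices made for this illustration.

```python
import numpy as np

def select_by_cumsum_mask(x, starts, ends):
    """Return the elements of x covered by any half-open range [start, end)."""
    # One extra slot so that an end index equal to x.size stays in bounds.
    steps = np.zeros(x.size + 1, dtype=np.int64)
    # np.add.at accumulates correctly even if an index is repeated.
    np.add.at(steps, starts, 1)
    np.add.at(steps, ends, -1)
    # The running sum is positive exactly where at least one range is open.
    mask = steps[:-1].cumsum() > 0
    return x[mask]

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 7])
ends = np.array([5, 9])
print(select_by_cumsum_mask(x, starts, ends).tolist())  # [5, 7, 9, 21, 27]
```

Memory use here is proportional to `x.size` and does not depend on the number of ranges. As with any mask, the result follows the order of `x`, and overlapping ranges are not duplicated.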
Proposal #3
A loop-based alternative, chosen for memory efficiency, might be better with lots of start-end pairs.
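A sketch of such a loop, assuming the loop fills a single boolean mask one range at a time. The function name `select_by_loop_mask` is hypothetical.

```python
import numpy as np

def select_by_loop_mask(x, starts, ends):
    """Return the elements of x covered by any half-open range [start, end)."""
    # One boolean per element of x, independent of the number of ranges.
    mask = np.zeros(x.size, dtype=bool)
    for i, j in zip(starts, ends):
        mask[i:j] = True  # slice assignment is done in C, not element by element
    return x[mask]

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 7])
ends = np.array([5, 9])
print(select_by_loop_mask(x, starts, ends).tolist())  # [5, 7, 9, 21, 27]
```

The Python loop runs once per range rather than once per element, so with thousands of ranges over millions of values the per-iteration overhead stays modest.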
Proposal #4
For completeness, here is another one that uses a loop to select the slices and assign them into a pre-initialized output array. It should do well when the slices are selected from a large array.
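A sketch of that slice-copy loop, assuming every range satisfies `0 <= start <= end <= x.size`. The function name `select_by_slice_copy` is invented for this example.

```python
import numpy as np

def select_by_slice_copy(x, starts, ends):
    """Concatenate x[start:end] for each pair, in the order the pairs are given."""
    lens = ends - starts
    # Allocate the output once; its size is the total length of all slices.
    out = np.empty(lens.sum(), dtype=x.dtype)
    pos = 0  # write position in the output
    for i, j, n in zip(starts, ends, lens):
        out[pos:pos + n] = x[i:j]
        pos += n
    return out

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 7])
ends = np.array([5, 9])
print(select_by_slice_copy(x, starts, ends).tolist())  # [5, 7, 9, 21, 27]
```

Unlike the mask-based variants, this never allocates anything of size `x.size`, and it keeps the ranges in the order given, so overlapping ranges produce repeated elements.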
If there are many iterations, a minor optimization can reduce the computation per iteration.
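One possible form of that optimization, as a sketch: compute all output offsets up front with a cumulative sum, so the loop body is a single slice copy with no running counter. The function name `select_by_precomputed_offsets` is hypothetical, and the same bounds assumption (`0 <= start <= end <= x.size`) applies.

```python
import numpy as np

def select_by_precomputed_offsets(x, starts, ends):
    """Concatenate x[start:end] for each pair, using precomputed output offsets."""
    lens = ends - starts
    # lims[k] is where slice k begins in the output; lims[-1] is the total size.
    lims = np.concatenate(([0], lens)).cumsum()
    out = np.empty(lims[-1], dtype=x.dtype)
    for i, j, s, t in zip(starts, ends, lims[:-1], lims[1:]):
        out[s:t] = x[i:j]
    return out

x = np.array([2, 3, 5, 7, 9, 12, 15, 21, 27, 101, 250])
starts = np.array([2, 7])
ends = np.array([5, 9])
print(select_by_precomputed_offsets(x, starts, ends).tolist())  # [5, 7, 9, 21, 27]
```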