I need to access a set of files.
How can I iterate through the contents of a folder?
Is it possible to filter the files that I want to obtain? For example, only the .jpg files.
What if I want all the files in the different subfolders? That is, everything contained in all the "Clase_X" subfolders:
imagenes
|
|
|-------- train
| |
| |----- Clase_1
| | |
| | imagen_1.jpg
| | imagen_2.png
| | datos.csv
| | ... etc
| |
| |----- Clase_2
| | |
| | imagen_1_1.jpg
| | imagen_2_1.png
| | datos.csv
| | ... etc
|
|
|-------- test
| |
| |----- Clase_1
| | |
| | imagen_1_t.jpg
| | imagen_2_t.png
| | datos.csv
| | ... etc
| |
| |----- Clase_2
| | |
| | imagen_1_2_t.jpg
| | imagen_2_2_t.png
| | datos.csv
| | ... etc
The most important module for accessing files and folders from Python is os. It is one of the built-in modules, so it is available as soon as Python is installed. The glob module is also widely used. I will answer the questions using the example from the question.
1. How to loop through what's in a folder?
We can do it with the os module and its listdir function, which takes the path of the folder whose contents we want. For example, listing the folder Clase_1 inside train gives this output:
["imagen_1.jpg", "imagen_2.png", "datos.csv", ...]
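A minimal sketch of the listdir call described above, assuming the imagenes folder from the question sits in the current working directory. The helper name list_folder is made up for illustration, and the result is sorted only because os.listdir returns names in arbitrary order.

```python
import os


def list_folder(folder):
    """Return the names (not full paths) of everything inside `folder`."""
    # os.listdir gives names in arbitrary order, so sort them for stable output
    return sorted(os.listdir(folder))


# Path taken from the tree in the question; adjust it to where `imagenes` lives
folder = os.path.join("imagenes", "train", "Clase_1")
if os.path.isdir(folder):
    print(list_folder(folder))
```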
We can do the same with glob. In this case we use the special character * to ask for everything in the folder. Output:
["imagen_1.jpg", "imagen_2.png", "datos.csv", ...]
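The glob version could look roughly like this; glob_folder is a hypothetical helper name. Note that glob.glob returns each match with the folder part of the pattern included, so entries look like imagenes/train/Clase_1/imagen_1.jpg rather than bare names; os.path.basename strips the prefix if only the names are wanted.

```python
import glob
import os


def glob_folder(folder):
    """Return the path of every non-hidden entry directly inside `folder`."""
    # "*" matches any name; by default glob skips names that start with "."
    return sorted(glob.glob(os.path.join(folder, "*")))


# e.g. names = [os.path.basename(p) for p in glob_folder("imagenes/train/Clase_1")]
```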
2. How to filter the files?
We can simply write a for loop with an if condition to keep the files we want, for example only the .png files. Output:
["imagen_2.png", ...]
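The loop-plus-condition idea can be sketched as follows. The function name png_files is an assumption made for the example, and the match is case-sensitive (a file named FOTO.PNG would be skipped).

```python
import os


def png_files(folder):
    """Return the names in `folder` that end with ".png"."""
    names = []
    for name in os.listdir(folder):
        if name.endswith(".png"):  # keep only the extension we care about
            names.append(name)
    return sorted(names)


# e.g. png_files(os.path.join("imagenes", "train", "Clase_1"))
```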
It can also be done with glob, and more easily, since glob does the looping and the condition for us. Output:
["imagen_2.png", ...]
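With glob the filter moves into the pattern itself. A small sketch, with png_paths as an illustrative name; as before, the returned strings include the folder prefix.

```python
import glob
import os


def png_paths(folder):
    """Return the paths of the .png files directly inside `folder`."""
    # "*.png" keeps only names ending in ".png"; no explicit loop or if needed
    return sorted(glob.glob(os.path.join(folder, "*.png")))


# e.g. png_paths(os.path.join("imagenes", "train", "Clase_1"))
```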
To finish this answer, the same can be done with os and with glob, respectively, when we want more than one type of file, for example .png and .jpg.
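One possible way to handle several extensions with each module is sketched below. The names EXTENSIONS, images_with_os and images_with_glob are invented for the example; the os version relies on str.endswith accepting a tuple, and the glob version simply runs one pattern per extension.

```python
import glob
import os

EXTENSIONS = (".png", ".jpg")  # the file types we want to keep


def images_with_os(folder):
    """Names in `folder` ending in any of EXTENSIONS, via os.listdir."""
    # str.endswith accepts a tuple and returns True if any suffix matches
    return sorted(n for n in os.listdir(folder) if n.endswith(EXTENSIONS))


def images_with_glob(folder):
    """Paths in `folder` ending in any of EXTENSIONS, via one glob per type."""
    paths = []
    for ext in EXTENSIONS:
        paths.extend(glob.glob(os.path.join(folder, "*" + ext)))
    return sorted(paths)
```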
3. How to get different subfolders and files?
In this case we need os.walk(). We pass it the path of a directory and it returns everything the directory contains, level by level: for each folder it visits, it yields a tuple with the folder's path, the names of its subfolders, and the names of its files. We can unpack that tuple in the for loop and keep whichever part we want; in this example, the images.
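A sketch of the os.walk loop described above, under the assumption that "images" means files ending in .jpg or .png. The function name images_per_folder is hypothetical; sorting dirnames in place only serves to make the traversal order predictable.

```python
import os


def images_per_folder(root):
    """Return one list of image paths for each folder under `root` that has any."""
    result = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # os.walk visits subfolders in the order left in this list
        images = [
            os.path.join(dirpath, name)
            for name in sorted(filenames)
            if name.endswith((".jpg", ".png"))
        ]
        if images:  # skip folders that contain no images
            result.append(images)
    return result


# e.g. images_per_folder("imagenes") gives a list of lists, one per Clase_X folder
```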
The result is a list of lists, which can be flattened in several ways to get the final result. Here I use a lambda function for it.
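The original lambda was not preserved, so the following is only one plausible shape for it: a lambda wrapping a nested list comprehension. The name flatten and the sample data are illustrative.

```python
# Flatten one level of nesting: [[a, b], [c]] becomes [a, b, c]
flatten = lambda nested: [item for sublist in nested for item in sublist]

# Sample data shaped like the per-folder lists, using names from the question
per_folder = [["imagen_1.jpg", "imagen_2.png"], ["imagen_1_1.jpg"]]
print(flatten(per_folder))
```

itertools.chain.from_iterable from the standard library would do the same job without a hand-written comprehension.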
1. How to iterate?
I propose two options: the glob module, or the glob method of Path from the pathlib module. First case:
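As a sketch of the first case, assuming the lost snippet used a recursive pattern: with recursive=True, "**" in glob.glob matches zero or more directory levels. The helper name all_entries is made up, and the result contains folders as well as files.

```python
import glob
import os


def all_entries(root):
    """Return every non-hidden file and folder found at any depth under `root`."""
    # "**" spans any number of directory levels only when recursive=True
    pattern = os.path.join(root, "**", "*")
    return sorted(glob.glob(pattern, recursive=True))


# e.g. all_entries("imagenes")
```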
But this only works on Python 3.5 or later. On earlier versions you can simply spell out the necessary subdirectories:
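Spelling out the subdirectories could mean one "*" per level, as in this sketch. It assumes the fixed depth of the tree in the question (imagenes / train or test / Clase_X / file); entries_two_levels_down is a hypothetical name.

```python
import glob
import os


def entries_two_levels_down(root):
    """Return the entries located exactly at root/<folder>/<subfolder>/<entry>."""
    # Each "*" stands for exactly one path component, so no recursion is needed
    return sorted(glob.glob(os.path.join(root, "*", "*", "*")))


# e.g. entries_two_levels_down("imagenes")
```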
The second case uses the glob method of Path and produces the same paths.
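A rough pathlib equivalent, again with an invented helper name. Path.glob takes a pattern relative to the path it is called on and yields Path objects, which are converted to strings here.

```python
from pathlib import Path


def all_entries_pathlib(root):
    """Return every file and folder under `root`, found with Path.glob."""
    # In Path.glob, "**" means "this directory and all subdirectories, recursively"
    return sorted(str(p) for p in Path(root).glob("**/*"))


# e.g. all_entries_pathlib("imagenes")
```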
2. Is it possible to filter the files that I want to obtain? For example only the .jpg
My proposal is to reuse the approach above, relying on the fact that a JPG filename (trusting that the extension matches the file type) ends with the string ".jpg". The result is then a list of strings.
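The ".jpg" check could be applied to either approach roughly like this. Both function names are illustrative, and the test is a plain case-sensitive str.endswith, as the paragraph above suggests.

```python
import glob
import os
from pathlib import Path


def jpg_with_glob(root):
    """Paths under `root` (any depth) whose string ends with ".jpg"."""
    pattern = os.path.join(root, "**", "*")
    return sorted(p for p in glob.glob(pattern, recursive=True) if p.endswith(".jpg"))


def jpg_with_pathlib(root):
    """Same result as jpg_with_glob, built with Path.glob."""
    return sorted(str(p) for p in Path(root).glob("**/*") if str(p).endswith(".jpg"))
```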
3. What if you want all the files in the different subfolders?
Here we just remove the simple filter we put in, and the result is again a list of strings.
Note how I play with the pattern that I pass as a parameter to glob.
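The exact patterns were not preserved, so this is only an illustration of how the pattern can be adjusted to target the "Clase_X" folders from the question. The name files_in_class_folders and the added os.path.isfile check (to drop any nested folders) are assumptions, not part of the original answer.

```python
import glob
import os


def files_in_class_folders(root):
    """Return the files sitting directly inside any folder named Clase_* under `root`."""
    # "**" reaches train/ and test/ (or deeper); "Clase_*" selects the class folders
    pattern = os.path.join(root, "**", "Clase_*", "*")
    return sorted(p for p in glob.glob(pattern, recursive=True) if os.path.isfile(p))


# e.g. files_in_class_folders("imagenes")
```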