How to use the glob() function to find files recursively in Python?

Recursively searching for files in a local directory is a common task that Python programmers need in their applications. It is done with glob patterns. These are shell-style wildcards and are often confused with regular expressions (regex), but they follow different and simpler rules. In this article, you will learn about the glob() function, which helps find files recursively from Python code.

What do you mean by the term glob?

Glob is a common term for matching file and path names against wildcard patterns, following the rules used by the Unix shell. Unix and Linux shells support glob patterns, and these systems also provide a glob() function in their system libraries.
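
To make the difference between a glob pattern and a regular expression concrete, here is a minimal sketch using the standard fnmatch module, which applies shell-style wildcard rules to plain strings without touching the file system. The pattern and the file names are hypothetical and chosen only for illustration.

```python
import fnmatch
import re

pattern = "data?.txt"  # hypothetical glob pattern

# Read as a glob pattern, '?' stands for exactly one character.
matches_one_char = fnmatch.fnmatchcase("data1.txt", pattern)
matches_two_chars = fnmatch.fnmatchcase("data12.txt", pattern)

# Read as a regular expression, the same text means something else:
# 'a?' makes the last 'a' optional and '.' matches any character.
regex_reading = re.fullmatch(pattern, "dat.txt") is not None

# fnmatch.translate() converts a glob pattern into an equivalent regex.
translated = re.compile(fnmatch.translate(pattern))
translated_match = translated.match("data1.txt") is not None

print(matches_one_char, matches_two_chars, regex_reading, translated_match)
```

The same string therefore selects different names depending on whether it is interpreted as a glob pattern or as a regex, which is why the two should not be mixed up.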

Glob in Python:

From Python 3.5 onwards, programmers can use the glob() function to find files recursively by combining the ** pattern with recursive=True. In Python, the glob module retrieves the files and pathnames that match the pattern passed as its parameter. The pattern rules follow standard Unix path expansion rules. Some reported benchmarks found glob to be faster than other methods of matching pathnames within directories, although no figures are given here and the result depends on the directory tree and the pattern. With glob, programmers can use wildcards (*, ?, etc.) instead of exact string matching to retrieve paths in a simpler and more efficient manner.

Syntax: glob() and iglob():

glob.glob(pathname, *, recursive=False)
glob.iglob(pathname, *, recursive=False)

The bare * in the signature means that recursive must be passed as a keyword argument. By default, recursive is set to False.
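
The effect of the default can be checked with a small sketch. It builds a throwaway directory tree with tempfile and runs the same ** pattern with and without recursive=True. The folder layout and the file name note.txt are assumptions made for this illustration only.

```python
import glob
import os
import tempfile


def compare_recursive_flag():
    """Glob a temporary tree with the default flag and with recursive=True."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: one note.txt at each of three depths.
        deep = os.path.join(root, "sub", "deep")
        os.makedirs(deep)
        for folder in (root, os.path.join(root, "sub"), deep):
            with open(os.path.join(folder, "note.txt"), "w") as fh:
                fh.write("example")

        # glob.escape() protects any special characters in the temp path.
        pattern = os.path.join(glob.escape(root), "**", "*.txt")
        default = glob.glob(pattern)                   # recursive=False
        recursive = glob.glob(pattern, recursive=True)

        def rel(paths):
            # Paths relative to the root are easier to read.
            return sorted(os.path.relpath(p, root) for p in paths)

        return rel(default), rel(recursive)


default_matches, recursive_matches = compare_recursive_flag()
print(default_matches)
print(recursive_matches)
```

With the default value, ** behaves like a single *, so only the file exactly one folder below the root is found. With recursive=True, the files at all three depths are returned.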

Program:

import glob

# A path without wildcards matches only that exact file, if it exists
print('Explicitly mentioned file :')
for n in glob.glob('/home/karlos/Desktop/stechies/anyfile.txt'):
    print(n)

# The '*' pattern matches any number of characters
print('\n Fetch all with wildcard * :')
for n in glob.glob('/home/karlos/Desktop/stechies/*'):
    print(n)

# The '?' pattern matches exactly one character
print('\n Searching with wildcard ? :')
for n in glob.glob('/home/karlos/Desktop/stechies/data?.txt'):
    print(n)

# The [0-9] pattern matches exactly one digit
print('\n Searching with wildcard having number ranges :')
for n in glob.glob('/home/karlos/Desktop/stechies/*[0-9].*'):
    print(n)

Explanation:

First, we import the glob module. Then we call the glob() method with a path pattern and print every match using the print() function. The first call passes a complete path without wildcards, so it returns that file only if it exists. The next calls add different patterns to the end of the path: * (asterisk) matches any number of characters, ? (question mark) matches exactly one character, and [range] matches one character from the given range. None of these calls is recursive, so they only fetch and display the files and folders directly inside that folder.
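
Because the paths in the program above exist only on the author's machine, the following sketch reproduces the same three wildcards against files it creates itself in a temporary folder. The file names are hypothetical and were picked so that each pattern selects a different subset.

```python
import glob
import os
import tempfile


def demo_wildcards():
    """Create sample files in a temporary folder and match each wildcard."""
    # Hypothetical file names chosen to exercise each pattern.
    sample_names = ["anyfile.txt", "data1.txt", "data22.txt", "report7.csv"]
    with tempfile.TemporaryDirectory() as folder:
        for name in sample_names:
            open(os.path.join(folder, name), "w").close()

        # glob.escape() protects any special characters in the temp path.
        base = glob.escape(folder)

        def match(pattern):
            found = glob.glob(os.path.join(base, pattern))
            return sorted(os.path.basename(p) for p in found)

        return {
            "*": match("*"),
            "data?.txt": match("data?.txt"),
            "*[0-9].*": match("*[0-9].*"),
        }


results = demo_wildcards()
for pattern, names in results.items():
    print(pattern, "->", names)
```

Here * returns all four files, data?.txt returns only data1.txt because ? stands for a single character, and *[0-9].* returns the three names that have a digit immediately before a dot.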

glob() with recursive set to True:

import glob

# glob() builds and returns the complete list of matches
print("Applying the glob.glob() :-")
fil = glob.glob('/home/karlos/Desktop/stechies/**/*.txt',
                recursive=True)
for f in fil:
    print(f)

# iglob() returns an iterator that yields matches one at a time
print("\n Applying the glob.iglob()")
for f in glob.iglob('/home/karlos/Desktop/stechies/**/*.txt',
                    recursive=True):
    print(f)

Explanation:

This is another program that shows how to traverse directories and sub-directories recursively. First we import the glob module. Then we call the glob() method with a path pattern and print each match using the print() function. The pattern combines ** and *: when recursive is True, ** matches the starting folder and any number of nested sub-folders, and *.txt matches every text file in each of them. The pattern string is the first parameter, while recursive=True is a keyword argument that decides whether ** traverses all the sub-directories.

The same goes for iglob(), short for iterator glob, which returns an iterator and yields the same values as glob() without storing them all in memory at once.
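
The difference between the two return types can be sketched as follows, again on a temporary tree with hypothetical file names. The helper first_match_lazily is not part of the glob module. It only illustrates that glob() hands back a list, while iglob() hands back an iterator from which matches can be pulled one at a time.

```python
import glob
import os
import tempfile
from collections.abc import Iterator


def first_match_lazily():
    """Compare the list from glob() with the iterator from iglob()."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical layout: log.txt in the root and two levels down.
        nested = os.path.join(root, "a", "b")
        os.makedirs(nested)
        for folder in (root, nested):
            open(os.path.join(folder, "log.txt"), "w").close()

        pattern = os.path.join(glob.escape(root), "**", "*.txt")
        as_list = glob.glob(pattern, recursive=True)
        lazy = glob.iglob(pattern, recursive=True)

        is_list = isinstance(as_list, list)
        is_iterator = isinstance(lazy, Iterator)

        first = next(lazy, None)          # pull a single match, or None
        remaining = sum(1 for _ in lazy)  # consume whatever is left

        return is_list, is_iterator, len(as_list), first is not None, remaining


is_list, is_iterator, total, got_first, remaining = first_match_lazily()
print(is_list, is_iterator, total, got_first, remaining)
```

Pulling matches with next() is useful when only the first hit matters or when the tree is large, since the full list never has to be built. The order in which matches arrive is not guaranteed, so sort the results if a stable order is needed.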

Conclusion:

glob() and iglob() are two essential functions that match paths either within a single directory level or recursively, depending on the value of the recursive keyword argument (True/False). They are more convenient than walking through directories manually, because the pattern matching is already built into Python's standard library.