Tooling Tuesday - Glob

Oh my glob!

Need to find files in a directory with Python? As glob is my witness, you shall!

Ok I'll stop the glob jokes...

So what is it?

Glob - Unix style pathname pattern expansion...or in simpler terms, it is a library that we can use to search drives and directories for files whose names match a pattern.
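
To get a feel for what "pattern expansion" means, here is a small sketch of the wildcards on their own. It uses the fnmatch module, which the Python docs say glob relies on for its matching, and the file names are made up purely for illustration, so nothing on disk is touched.

```python
import fnmatch

# Made-up file names, purely for illustration (nothing is read from disk).
names = ["cat.png", "dog.png", "notes.txt", "img1.jpg", "img2.jpg", "img10.jpg"]

# * matches any run of characters, including none at all.
png_files = [n for n in names if fnmatch.fnmatchcase(n, "*.png")]

# ? matches exactly one character, so img10.jpg is left out.
single_digit = [n for n in names if fnmatch.fnmatchcase(n, "img?.jpg")]

# [ct]* matches names that start with either c or t.
c_or_t = [n for n in names if fnmatch.fnmatchcase(n, "[ct]*")]

print(png_files)
print(single_digit)
print(c_or_t)
```

The same wildcards work inside the pattern we hand to glob.glob(), the difference being that glob checks them against real files.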

Why is this useful?

Well, you might be writing a script that looks for certain files and then creates backups in a remote location. You can use glob to read the directory into a list, pause for a number of minutes, then create a second list to compare with the first for any changes. Glob hands us the matching files as lists, ready for our use.
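
As a rough sketch of that "compare two listings" idea, something like the code below could work. The snapshot and compare helpers are names I have invented for this example, and the demo runs in a throwaway temporary directory rather than a real backup folder. Note that it only spots files that were added or removed, not files whose contents changed.

```python
import glob
import os
import tempfile


def snapshot(pattern):
    """Return the set of paths that match the glob pattern right now."""
    return set(glob.glob(pattern))


def compare(before, after):
    """Return (added, removed) paths between two snapshots, each sorted."""
    return sorted(after - before), sorted(before - after)


# Demo in a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    # glob.escape protects us if the folder name contains wildcard characters.
    pattern = os.path.join(glob.escape(folder), "*.txt")

    open(os.path.join(folder, "a.txt"), "w").close()
    before = snapshot(pattern)

    # A real script would wait here, for example with time.sleep(...).
    open(os.path.join(folder, "b.txt"), "w").close()
    os.remove(os.path.join(folder, "a.txt"))
    after = snapshot(pattern)

    added, removed = compare(before, after)
    added_names = [os.path.basename(p) for p in added]
    removed_names = [os.path.basename(p) for p in removed]

print("added:", added_names)
print("removed:", removed_names)
```

Sets are used instead of lists here because subtracting one set from another gives the differences in a single step.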

So how can I use it?

Ok, here is a scenario. We have lots of pictures in a directory called Pictures (funny that!). We shall write a short script that will save the contents of the directory to a list, and print the name of each file as it is saved to the list.

To write this code I am using Mu, an easy to use Python editor aimed at learners.

So we shall start by importing the glob and time libraries. Glob is used to search for files, and time will be used to introduce a pause every time the loop goes round (iterates).

import glob
import time

So where can we store the list of files that we find in the directory? Well in a list.

files = []

In Python, lists are objects that can store large amounts of data. Each item is stored at an index, which starts at zero and can go on and on! Lists are mutable, so we can create, update and destroy them. Tuples look and act much like lists, but they are immutable: once created they cannot be updated, only replaced or destroyed.
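
Here is a tiny sketch of that difference between lists and tuples. The file names are invented for the example.

```python
# A list can grow and change after it has been created.
files = []
files.append("a.png")   # add to the end
files.append("b.png")
files[0] = "c.png"      # replace the item at index 0
first = files[0]        # indexes start at zero

# A tuple holds items in the same way, but refuses to be updated.
frozen = ("a.png", "b.png")
try:
    frozen[0] = "c.png"
    tuple_changed = True
except TypeError:
    # Python raises TypeError when we try to assign to a tuple item.
    tuple_changed = False

print(files)
print(first)
print(tuple_changed)
```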

Now to fill our new files list we use a for loop and it works like so.

  • Each time the for loop goes round, it updates a variable called file with the full path to the next picture / file.
  • The paths come from finding pictures in the directory C:\Users\pc\Pictures\ with the file extension .png
  • When we run out of pictures that have the .png file extension, the loop will stop.

for file in glob.glob("c:\\Users\pc\Pictures\*.png"):

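Windows uses backslashes \ in file paths, which can confuse Python, because a backslash inside a string starts an escape sequence. Doubling the first one, like this C:\\, is enough for this particular path, as \p, \P and \* are not real escape sequences and Python leaves them alone, but recent versions of Python print a warning about them. The safe fix is to double every backslash, or to put an r in front of the string to make it a raw string. For Linux users, ignore this as we use forward slashes / :)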

Update 16/10/2018

Pythonista Tim Golden sent me a tweet to correct this section. Windows accepts forward slashes in paths too, so change the line

for file in glob.glob("c:\\Users\pc\Pictures\*.png"):

to

for file in glob.glob("c:/Users/pc/Pictures/*.png"):

Tim, thanks for the comment!
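
If you would rather keep the backslashes, here is a small sketch comparing the different ways to spell the same Windows path in Python. PureWindowsPath comes from the standard pathlib module and only compares the text of the paths, so this runs on any operating system and never looks at the disk.

```python
from pathlib import PureWindowsPath

# Every backslash doubled, so none of them starts an escape sequence.
doubled = "c:\\Users\\pc\\Pictures\\*.png"

# A raw string (note the r prefix) keeps backslashes exactly as typed.
raw = r"c:\Users\pc\Pictures\*.png"

# Forward slashes, which Windows also accepts.
forward = "c:/Users/pc/Pictures/*.png"

# The doubled and raw spellings are the very same string.
same_string = doubled == raw

# As Windows paths, the forward slash version means the same thing too.
same_path = PureWindowsPath(forward) == PureWindowsPath(raw)

print(same_string)
print(same_path)
```

Which one to use is mostly taste. Forward slashes have the bonus of working unchanged on Linux and macOS.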

So now that we have the file path to a picture, we need to add this to the files list, and to do that we use the list's append method.

    files.append(file)

The last two lines will print the file path of our picture to the Python REPL (Shell) and then wait for 0.2 seconds before repeating the steps.

    print(file)
    time.sleep(0.2)
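
Putting the pieces together, the whole script might look something like this sketch. To make it runnable on any machine I have wrapped the loop in a function called list_files (my own name, not part of glob) and pointed it at a temporary directory holding a few empty files. For the real thing, pass in a pattern such as "c:/Users/pc/Pictures/*.png" instead.

```python
import glob
import os
import tempfile
import time


def list_files(pattern, delay=0.2):
    """Collect every path matching pattern, printing each one as we go."""
    files = []
    for file in glob.glob(pattern):
        files.append(file)   # save the path to our list
        print(file)          # show it in the REPL
        time.sleep(delay)    # short pause before the next file
    return files


# Demo: a temporary "Pictures" directory with two pictures and one text file.
with tempfile.TemporaryDirectory() as pictures:
    for name in ("one.png", "two.png", "notes.txt"):
        open(os.path.join(pictures, name), "w").close()

    # glob.escape protects us if the folder name contains wildcard characters.
    pattern = os.path.join(glob.escape(pictures), "*.png")
    found = list_files(pattern, delay=0)
    found_names = sorted(os.path.basename(f) for f in found)

print(found_names)
```

glob.glob() does not promise any particular order, which is why the names are sorted at the end. It also already returns a list, so the loop is only needed here because we want to print and pause for each file.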

Save your code, and when ready run it!

You should see the list of files appear in the REPL!

There we go!

Super easy to use, really handy and another bit of Python that may save us some time!

Happy Hacking!