Tuesday Tooling: I'm a Massive Faker, With Python

Need to make a lot of fake user data quickly? You need Faker.

So What Is It?

Faker, by Daniele Faraglia, is a Python module that generates fake data. It is not meant for nefarious ends; it is there to help you generate data for testing your databases and the like.

Thanks

Hat tip to Mike Driscoll for alerting me to this very handy module. Mike is a prolific Python educator who uses Twitter / X as a quick and digestible means to share new Python techniques and modules.

So How Do I Install It?

Via pip!

On my Ubuntu laptop I chose to install it inside a virtual environment (venv), so first I had to create one.

python3 -m venv faker

Then change directory to the venv.

cd faker

Then I activated the venv, essentially enabling me to use the Faker module without it touching the Python install on my laptop.

source bin/activate

Now my Terminal session tells me that I have activated the venv: I can see the (faker) text at the start of the prompt.



Now I use pip to install.

pip install faker

So How Do I Use It?

We'll start with the basics: generating a fake user. For this we need their name, so we import the Faker class from the faker module.

from faker import Faker

Then we create an object called fake, an instance of the Faker class, and use it as a means to generate our data.

fake = Faker()

Then I can generate a fake name.

print(fake.name())

I use a for loop to generate a few names to illustrate how it can be automated.

So What Else Can It Do?

Faker is not just about names. We can generate...

  • Fake addresses
  • Fake registration (license) plates
  • Fake barcodes
  • Fake ISBN book numbers
  • Fake email addresses

And much more. Take a look at the Standard Providers section of the documentation to learn more.

These are not "fake" as in counterfeit. They are pseudo-random creations made for testing databases, not for illegal activities.

Keeping It Local



Faker can generate data for many different locales, so if you need names in different languages, it can do just that.

When creating the 'fake' object we can pass a list of locales as an argument.

from faker import Faker
fake = Faker(['it_IT', 'en_US', 'ja_JP'])
for _ in range(10):
    print(fake.name())

Quick Demo: Generating a CSV to Import Into a Database

Our scenario is that our boss wants 100 names, addresses and countries to add to a test database. We could sit there and generate them all manually, heck, we have done that in the past. But with Faker we can do it all in a few lines of Python, and then export the data to a CSV.

We start by importing the csv and faker modules, then we create the 'fake' object.

import csv
from faker import Faker
fake = Faker()

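We then open (or create if it does not exist) a file called fakedb.csv. The 'w' mode opens it for writing and overwrites anything already in the file, and newline='' leaves the line endings to the csv module.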

with open('fakedb.csv', 'w', newline='') as file:

We then create an object called 'writer' and use that as a means to write data to the CSV file.

    writer = csv.writer(file)

Then we create the column headings for the CSV file. We need the name, address and country. The commas in the Python list simply separate the items; the csv writer adds the commas to the file for us. These headings are written to the CSV file only once.

    field = ["Name", "Address", "Country"]
    writer.writerow(field)

Using a for loop which iterates 100 times we generate the fake name, address and country for our fake users.

    for i in range(100):
        writer.writerow([fake.name(), fake.address(), fake.country()])

Complete Code Listing

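import csv
from faker import Faker
fake = Faker()

# 'w' overwrites any existing file; newline='' leaves line endings to the csv module
with open('fakedb.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    # Write the column headings once, as the first row
    field = ["Name", "Address", "Country"]
    writer.writerow(field)
    # Write 100 rows of fake user data
    for i in range(100):
        writer.writerow([fake.name(), fake.address(), fake.country()])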

I ran the code and it went through cleanly. No errors. Then I imported the data into LibreOffice Calc and here we go.



I now have 100 fake user details for our database test. Using the Standard Providers I could add fake bank details, credit cards, even fake passport data. This is an exceptionally powerful tool for database admins, or interns looking to take a few hours off ;)

Happy Hacking