The most adaptable Python wordcloud function (steal this!)

Michael ODonnell
2 min read · Oct 16, 2022


wordcloud of last 2000 tweets with the hashtag “#halloween”

After visiting the wordcloud documentation page one too many times, I needed a flexible Python function that creates and saves a wordcloud from a string, with or without a mask image.

So, if you’re looking for an easy, adaptable way to create a wordcloud in Python, there is a function for you below.

Before diving into the code, here are a couple of example wordcloud outputs. First, an unmasked wordcloud of the book The Odyssey:

Second, a wordcloud for the most recent 2,000 tweets that contained the hashtag “#datascience” with a mask image of a computer:

To make these wordclouds, I call the function flexible_wordcloud_function, which takes the following parameters:

# text: a string to generate your wordcloud
# output_filepath: a string that your wordcloud png will output/save to
# mask_path: a string of your mask’s filepath
# white_mask_background: if you have a mask and the background is not white, set this to False
# width: if you do not have a mask, enter the desired width in pixels of your wordcloud
# height: if you do not have a mask, enter the desired height in pixels of your wordcloud
# background_color: color of your wordcloud background
# colormap: matplotlib colormap to randomly draw colors from for each word
# colormap documentation here: https://matplotlib.org/stable/gallery/color/colormap_reference.html
# contour_color: color of your mask’s outline
# contour_width: width of your mask’s outline
# collocations: set to True if you would like to see bigrams
# max_words: maximum number of distinct words
# max_font_size: maximum font size
# min_font_size: minimum font size
# prefer_horizontal: from 0.0 to 1.0, how strongly you prefer highly frequent words to be shown horizontally
# include_numbers: set to False if you do not want numbers
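
As a rough illustration of how these parameters could map onto the `wordcloud` library's `WordCloud` class, here is a minimal sketch. It is not the function from this article: the names `build_wordcloud_kwargs` and `wordcloud_sketch` are hypothetical, the default values are assumptions, and `white_mask_background` is left out because the parameter list above does not say how a non-white mask background is handled. It assumes `wordcloud`, `numpy`, and `Pillow` are installed.

```python
def build_wordcloud_kwargs(mask=None, width=800, height=400,
                           background_color="white", colormap="viridis",
                           contour_color="black", contour_width=0,
                           collocations=False, max_words=200,
                           max_font_size=None, min_font_size=4,
                           prefer_horizontal=0.9, include_numbers=True):
    """Collect keyword arguments for wordcloud.WordCloud (hypothetical helper).

    All default values here are assumptions made for this sketch.
    """
    kwargs = dict(
        background_color=background_color,
        colormap=colormap,
        collocations=collocations,
        max_words=max_words,
        max_font_size=max_font_size,
        min_font_size=min_font_size,
        prefer_horizontal=prefer_horizontal,
        include_numbers=include_numbers,
    )
    if mask is not None:
        # With a mask, WordCloud takes its canvas size from the mask image,
        # so width and height are not passed.
        kwargs.update(mask=mask, contour_color=contour_color,
                      contour_width=contour_width)
    else:
        kwargs.update(width=width, height=height)
    return kwargs


def wordcloud_sketch(text, output_filepath, mask_path=None, **options):
    """Generate a wordcloud from `text` and save it as an image file."""
    # Third-party imports are kept local so the helper above stays importable
    # even where these packages are not installed.
    from wordcloud import WordCloud

    mask = None
    if mask_path is not None:
        import numpy as np
        from PIL import Image
        mask = np.array(Image.open(mask_path))

    cloud = WordCloud(**build_wordcloud_kwargs(mask=mask, **options))
    cloud.generate(text)
    cloud.to_file(output_filepath)
    return cloud
```

Under these assumptions, a call such as `wordcloud_sketch(my_text, "cloud.png", width=1200, height=600)` would write an unmasked wordcloud, while passing `mask_path` switches to the mask's own dimensions and enables the contour settings.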

flexible_wordcloud_function relies on two helper functions, tokenize_text and lemmatize_text, to prepare your string. You do not need to alter them, although you may use them separately for other NLP tasks.
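
To give a feel for what this kind of preparation step does, here is a small standard-library sketch. It is an assumption-laden stand-in, not the article's `tokenize_text` or `lemmatize_text`: the function names, the tiny stopword list, and the plural-stripping rule are all hypothetical, and the "lemmatizer" is deliberately naive. A real implementation would more likely use an NLP library such as NLTK or spaCy.

```python
import re

# Tiny illustrative stopword list (an assumption for this sketch)
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "it"}


def tokenize_text_sketch(text):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def lemmatize_text_sketch(tokens):
    """Naive plural stripping; NOT true lemmatization."""
    lemmas = []
    for token in tokens:
        if token.endswith("ies") and len(token) > 4:
            lemmas.append(token[:-3] + "y")      # candies -> candy
        elif (token.endswith("s") and not token.endswith("ss")
              and len(token) > 3):
            lemmas.append(token[:-1])            # ghosts -> ghost
        else:
            lemmas.append(token)
    return lemmas


def prepare_text_sketch(text):
    """Return a cleaned, space-joined string ready to feed a wordcloud."""
    return " ".join(lemmatize_text_sketch(tokenize_text_sketch(text)))
```

Merging word variants this way keeps "ghost" and "ghosts" from appearing as two separate entries in the cloud, which is the main reason to prepare the string before generating the image.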

The entire code you need to copy and paste into your Jupyter Notebook or .py file is below:

The above is a lot of code for just a wordcloud, but it is both robust and flexible, so reading through it may be helpful before running flexible_wordcloud_function. Enjoy!
