Summary

I assume I am not alone in having a dreaded “spam” email account: an AOL or Yahoo inbox that gathers thousands of marketing messages, newsletters, and occasional 2FA codes and password resets. I’ve wanted to clean mine out for years, but the task has always seemed too daunting. Maybe I’m crazy, but my brain won’t allow me to hit Ctrl+A and Mark as Read. What if there’s something important I’ve skipped over? This exercise effectively proved that I could have done that without consequence, but knowing for sure allows me to sleep at night. I decided to document my approach for when I inevitably fall behind on this again in 5 years. Hopefully, it will benefit someone else too. I’ve scripted most of the process and combined it with Windows Sandbox to make it “relatively” painless.

Overview of the process:

  1. Download Thunderbird
  2. Use IMAP to connect to an email account
  3. Run a Python script to parse the mailbox file and find the most prolific senders
  4. Quickly “mark as read” large chunks of similar messages from the same sender
  5. Manually read whatever is left

Prerequisites

  1. Windows 10 / 11
    • Pro, Enterprise, or Education (Home edition does not support Windows Sandbox)
    • Version 1903 or later
  2. Windows Sandbox
    • Installation Guide
    • Note: Sandbox environments are ephemeral
      • Any installed software, data, or changes will be discarded when the sandbox is closed

This guide also assumes that you have a basic understanding of using a command line interface and Windows File Explorer. If you do not, you can follow along with the screenshots and descriptions, but you may need to do some additional research to understand what is happening.

Step 1: Clone or Download the Repository

Clone this GitHub repository: https://github.com/robbycuenot/frequent-senders

If you don’t know what that means, download this zip file: https://github.com/robbycuenot/frequent-senders/archive/refs/heads/main.zip

Extract the contents of the zip file.

Step 2: Open Windows Sandbox

Inside the extracted zip file or cloned repository, you will find a file called sandbox.wsb. If Windows Sandbox is installed correctly, you should be able to double-click this file to launch a sandbox. This will open a new instance, download a PowerShell script, and execute it. The script does the following:

Install:

Download:

Apply:

  • Windows taskbar icon layout
  • Left-aligned taskbar
  • New default PowerShell directory
    • (C:\Users\WDAGUtilityAccount\Desktop)

Disable:

  • Chat and Taskview icons
  • Windows 11 Context Menus

The terminal window will display “Setup Complete!” when the script has finished.

You may close the terminal window if you wish.

Step 3: Connect to your email account

Open Thunderbird by clicking on the desktop or taskbar icon. You will be prompted to set up an existing email address. Steps may vary at this point. Every provider that I tested with allowed for OAuth authentication (browser pop-up, log in, allow Thunderbird to read/write messages). Legacy / corporate / self-hosted email servers may not. If your provider does not support OAuth, you will need to research how to connect it to Thunderbird.

The easiest way to find out is to test it. Fill out only the name and email address of your account, then hit Continue. In this example, I am using an AOL address. The IMAP server settings should populate, and the IMAP radio button should be selected. Click Done. If OAuth is supported, a browser window will open and prompt for your credentials.

Allow Thunderbird to access your account.

Thunderbird should now display your messages.

Step 4: Update IMAP Settings for AOL and Yahoo (optional)

At this point, Thunderbird will attempt to download all of your messages. However, some servers limit the number of messages that third-party clients can retrieve (10,000 in the case of AOL and Yahoo). If you have fewer than 10,000 total messages or are not using AOL or Yahoo, proceed to Step 5.

You will need to change your IMAP settings to retrieve the entire inbox. In the bottom left corner, click the Settings icon > Account Settings > Server Settings. Change the Server Name to export.imap.aol.com for AOL or export.imap.yahoo.com for Yahoo. Click anywhere outside that text box, and you will be prompted to restart Thunderbird. Upon restart, you should see a progress indicator at the bottom of the window, which displays the new, higher message count.

Step 5: Take a Backup (optional)

Before running a script against any system, I strongly recommend taking a backup. If you feel like raw-dogging this process, you are doing so at your own risk. I’ve intentionally written this script to only touch a copy of the mailbox placed on the Desktop. So, while it is improbable that your actual mailbox could be corrupted and pushed back to the IMAP server, I recommend putting a full copy somewhere safe before proceeding. Unfortunately, you must wait for Thunderbird to synchronize all messages down to the Sandbox before you can take a backup. Depending on your connection speed and the rate-limiting of your email provider, this could take several hours. In my experience, it’s been about an hour per 10,000 messages.

Once synchronized, open a new File Explorer window. Navigate to %APPDATA% by pasting it into the path bar at the top. Windows Sandbox allows you to copy and paste files to the host OS – copy the Thunderbird folder to a location of your choice. I’ve placed mine on my Desktop, which auto-uploads to OneDrive, for additional safety.
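If you would rather copy a single file out of the Sandbox than a folder tree, the backup can be zipped first. This is a minimal sketch and not part of the repository: the function name archive_profile and the archive name are my own assumptions, and it assumes Thunderbird is closed so that no mailbox file changes mid-copy.

```python
import shutil
from pathlib import Path


def archive_profile(thunderbird_dir, archive_base):
    """Zip a Thunderbird data folder and return the path of the archive.

    archive_base is the target path WITHOUT the .zip extension and should
    live outside thunderbird_dir so the archive does not include itself.
    """
    source = Path(thunderbird_dir)
    if not source.is_dir():
        raise FileNotFoundError(f"No Thunderbird folder at {source}")
    # make_archive appends ".zip" to archive_base and returns the final path.
    return Path(shutil.make_archive(str(archive_base), "zip", root_dir=str(source)))
```

Inside the Sandbox this could be called as archive_profile(Path(os.environ["APPDATA"]) / "Thunderbird", Path.home() / "Desktop" / "thunderbird-backup") after importing os, and the resulting zip copied to the host.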

Step 6: Save your ‘Unread’ MBOX file

In Thunderbird, right-click on Local Folders and select New Folder. Name the folder INBOX (in all caps) and click OK.

On the left-hand sidebar, select the folder or folders under the name of the email account that you want to process. In my case, I chose the Inbox folder. Open the Quick Filter menu and select Unread. Press Ctrl+A to select all unread messages, then right-click and select Copy To > Local Folders > INBOX. This copies all unread messages to the INBOX folder you just created.

Open a new File Explorer window. Navigate to %APPDATA% by pasting it into the path bar at the top.

From here, navigate to Thunderbird\Profiles\*.default-release\Mail\Local Folders. You should see a file called INBOX, without an extension. Copy this to the desktop of the Sandbox, as the Python script expects to find it there.
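The same lookup and copy can be done from Python if you prefer not to click through File Explorer. This is only a sketch under the assumptions stated in this step: the profile folder ends in .default-release and the mailbox file is named INBOX. The function names find_local_inbox and copy_to_desktop are hypothetical and are not taken from the repository.

```python
import shutil
from pathlib import Path


def find_local_inbox(appdata):
    """Return the first matching Local Folders INBOX file, or None."""
    # The profile folder has a random prefix, hence the wildcard.
    pattern = "Thunderbird/Profiles/*.default-release/Mail/Local Folders/INBOX"
    matches = sorted(Path(appdata).glob(pattern))
    return matches[0] if matches else None


def copy_to_desktop(source, desktop):
    """Copy the mailbox file to the given desktop folder, keeping its name."""
    return Path(shutil.copy2(source, Path(desktop) / "INBOX"))
```

In the Sandbox, appdata would be the value of the APPDATA environment variable and desktop would be C:\Users\WDAGUtilityAccount\Desktop.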

Step 7: Run the Python script

If you still have the initial terminal window open, you can use it. Otherwise, open a new terminal window from its icon in the taskbar. Then run the following command:

python .\frequent-senders.py

Assuming you have done everything correctly up until this point, the script should kick off without issue. Once the mailbox has been loaded into memory, a progress bar will appear. For very large mailboxes, this may take several minutes.
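For a rough idea of what this kind of tally involves, here is a minimal sketch using only the Python standard library. It is not the code from frequent-senders.py; the function name count_senders is my own, and it assumes the INBOX file is a standard mbox file, which is how Thunderbird stores Local Folders by default.

```python
import mailbox
from collections import Counter
from email.utils import parseaddr


def count_senders(mbox_path):
    """Tally messages per sender address in an mbox file."""
    counts = Counter()
    # create=False raises an error instead of silently creating an empty file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # parseaddr splits 'Name <user@host>' into (name, address).
            _, address = parseaddr(str(message.get("From", "")))
            if address:
                counts[address.lower()] += 1
    finally:
        box.close()
    return counts
```

Calling count_senders on the copied INBOX file and then .most_common(10) on the result would list the ten most frequent sender addresses.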

When complete, VS Code will open automatically with four auto-generated files:

  • domains.csv
  • subdomains.csv
  • addresses.csv
  • report.pdf

The report shows a visual representation of the top 10 most common domains, subdomains, and addresses that have sent mail to your account.
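To illustrate the difference between the domain and subdomain tallies, here is a simplified sketch. It assumes the domain is just the last two labels of the sender's host, which is wrong for suffixes like co.uk; a more careful version would consult the Public Suffix List. I am not claiming the actual script works this way, and split_sender and tally_hosts are hypothetical names.

```python
from collections import Counter


def split_sender(address):
    """Return (domain_guess, full_host) for an email address."""
    host = address.rsplit("@", 1)[-1].lower()
    labels = host.split(".")
    # Naive assumption: the registered domain is the last two labels.
    domain = ".".join(labels[-2:])
    return domain, host


def tally_hosts(addresses):
    """Count messages per guessed domain and per full host name."""
    domains, subdomains = Counter(), Counter()
    for address in addresses:
        domain, host = split_sender(address)
        domains[domain] += 1
        subdomains[host] += 1
    return domains, subdomains
```

With a made-up address such as news@e.linkedin.com, the subdomain tally would count e.linkedin.com while the domain tally would count linkedin.com, so mail sent from several subdomains of one company rolls up into a single domain row.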

The CSV files contain the raw data used to generate the report, including tallies for every single sender.
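As a sketch of how tallies like these could be written to and read back from a CSV file with the standard library: the column names sender and count are assumptions of mine and may not match the headers in the generated files, and write_tally and read_top are hypothetical helpers.

```python
import csv


def write_tally(counts, csv_path):
    """Write a collections.Counter to CSV, most frequent sender first."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sender", "count"])  # assumed column names
        writer.writerows(counts.most_common())


def read_top(csv_path, limit=10):
    """Return the `limit` highest (sender, count) pairs from the CSV."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = [(row["sender"], int(row["count"])) for row in csv.DictReader(handle)]
    return sorted(rows, key=lambda row: row[1], reverse=True)[:limit]
```

Sorting by count puts the senders worth bulk-marking as read at the top, which is the order I worked through them in.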

Using this data, I was able to knock out thousands of messages with similar content relatively quickly. LinkedIn clearly spams me to death.

Conclusion

A wise man once said: “Never spend 6 minutes doing something by hand when you can spend 6 hours failing to automate it”.

I spent way more than six hours on this 😅 but hopefully it will save others some time. If you have any questions or suggestions, PRs are welcome.

GitHub repository: https://github.com/robbycuenot/frequent-senders

Thanks for reading!