Lots of spurious partial addresses with Gmail Takeout mbox

pkuras

New Member
Hello,

I am using Email Extractor to get addresses from an mbox file generated by Google's data takeout function.

The process works, but I find that I get a lot of partial addresses - some truncated at the beginning and some at the end. Examples:

[email protected]
[email protected]
[email protected]

[email protected]
[email protected]

[email protected]
[email protected]

[email protected]
[email protected]

I also see a lot of what appear to be SMTP message IDs:
[email protected]
caj2qrmy1pyif6nv2swrphplopwn5erocmuneyp ... .gmail.com
f26551f62e62400b91a6ee2d6c35fb48@dm2pr0 ... utlook.com
dc115947bc87b943b8d95092c02cad53a63952e ... ompany.com

There's far too much junk in here for this to be usable. Is there any way of resolving this?
 

stanbusk

Administrator
Staff member
Re: Lots of spurious partial addresses with Gmail Takeout mbox

Why do you use mbox files? Why not save the messages as plain text files from the mail software? That way you would have plain text files to extract from.
 

pkuras

New Member
Re: Lots of spurious partial addresses with Gmail Takeout mbox

I use mbox files because I am working with a full mailbox export from Gmail, and that's the format they provide. Also, an mbox file is just a big text file.
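
Since an mbox file is plain text with a known structure, one possible workaround is to read only the address headers of each message instead of scanning the whole file. The following is a minimal sketch using Python's standard `mailbox` module, not a feature of Email Extractor. The function name `addresses_from_mbox` and the choice of headers are assumptions made for illustration.

```python
import mailbox
from email.utils import getaddresses

# Assumption: only these headers are of interest. Message-ID,
# References and message bodies are deliberately ignored.
ADDRESS_HEADERS = ("From", "To", "Cc", "Bcc", "Reply-To")

def addresses_from_mbox(path):
    """Return the sorted, de-duplicated addresses found in address headers."""
    found = set()
    box = mailbox.mbox(path, create=False)
    try:
        for message in box:
            values = []
            for header in ADDRESS_HEADERS:
                # get_all returns every occurrence of the header, or [] if absent
                values.extend(message.get_all(header, []))
            # getaddresses parses "Name <addr>" lists into (name, addr) pairs
            for _name, address in getaddresses(values):
                if "@" in address:
                    found.add(address.lower())
    finally:
        box.close()
    return sorted(found)
```

Calling `addresses_from_mbox("takeout.mbox")` (a hypothetical file name) would return a list of header addresses only, so message IDs and any text inside message bodies never reach the output.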
 

stanbusk

Administrator
Staff member
Re: Lots of spurious partial addresses with Gmail Takeout mbox

Yes, but something must be wrong with the file, since otherwise you would not get that problem. eMail Extractor does nothing to the file, it just extracts emails. Emails are pieces of text, delimited by spaces, line breaks or a few other characters, that contain an '@', a domain name and an extension. Look at your mbox file with a text editor and search for one of those addresses. How does it appear there?
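
To make that rule concrete, here is a rough sketch of such a token-based extractor in Python. It only approximates the description above and is not eMail Extractor's actual code; the regular expression and the sample strings are made up for illustration. It shows how an address that is split across a line break in the raw file (for example by an encoding that wraps long lines with a trailing `=`, which is an assumption here and not confirmed in this thread) comes out as a partial address, and how a Message-ID header passes as an address.

```python
import re

# Hypothetical approximation of the rule: a run of address characters
# containing '@', a domain name and an extension of two or more letters.
TOKEN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def extract_addresses(text):
    """Return every address-like token in text, in order of appearance."""
    return TOKEN.findall(text)

# Made-up samples, not taken from the poster's file.
samples = {
    "intact": "To: alice@example.com",
    "cut at the start": "Cc: bob.sm=\nith@example.com",
    "cut at the end": "Cc: bob@mail.exam=\nple.com",
    "message id": "Message-ID: <abc123@mail.gmail.com>",
}

for label, raw in samples.items():
    print(label, extract_addresses(raw))
```

If the raw file does contain addresses broken like the second and third samples, that would match the truncated results reported in the first post.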
 