I’ve been after that golden goose of auto-imported transactions from my US banks into a self-hosted financial manager for some time now. Plaid doesn’t work with some of my banks, and comes with a slew of privacy compromises anyway. I’m looking to import transactions into Firefly III (or ActualBudget) by scraping information from bank alert emails about my transactions. I wanted to write about it here in case someone has experience doing this or any tips, or can tell me if this is a silly venture.
My plan is to set alerts for all transactions across my banks and direct them all to a single email address. Then I’ll write a Python script that checks the inbox every 5 minutes or so. If it detects a new email, it will parse it with some code I write, extract the amount and the payee, and then attempt to import the transaction into (in this case) ActualBudget using the importTransactions API call.
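A rough sketch of the polling and parsing steps, using only the Python standard library, might look like the following. The alert wording ("A charge of $X at PAYEE was made ..."), the regex, and the names fetch_unseen and parse_alert are all assumptions made up for illustration; every bank phrases its alerts differently, so each one would need its own pattern. Handing the result to importTransactions is left out here.

```python
import email
import imaplib
import re
from decimal import Decimal
from email import policy

# Hypothetical alert wording; a real bank's emails will need their own pattern.
ALERT_RE = re.compile(
    r"charge of \$(?P<amount>[\d,]+\.\d{2}) at (?P<payee>.+?) (?:on|was)\b",
    re.IGNORECASE,
)


def fetch_unseen(host, user, password):
    """Return the raw bytes of each unread message in INBOX.

    Fetching the full message marks it as read on the server.
    """
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        _, data = imap.search(None, "UNSEEN")
        raws = []
        for num in data[0].split():
            _, parts = imap.fetch(num, "(RFC822)")
            raws.append(parts[0][1])
        return raws


def parse_alert(raw):
    """Extract payee, amount, and date from one alert email, or return None."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    if body is None:
        return None
    # Collapse whitespace so wrapped lines do not break the pattern.
    text = " ".join(body.get_content().split())
    match = ALERT_RE.search(text)
    if match is None:
        return None
    amount = Decimal(match.group("amount").replace(",", ""))
    date_header = msg["Date"]
    return {
        "payee": match.group("payee").strip(),
        # Assumption: a charge is recorded as a negative amount.
        "amount": -amount,
        "date": date_header.datetime.date().isoformat() if date_header else None,
    }
```

Running fetch_unseen from cron or a sleep loop every few minutes and passing each result to parse_alert would cover the "check the inbox" half of the plan. Emails that return None are probably worth logging so a changed alert format doesn't go unnoticed.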
It’s going to be a bit of a pain in the ass to set this up as I see it (I’m also a bit of a beginner, but think I can make it work) and I just want to see if anyone else has tried this. Thanks!
I know that banks in Europe are bound by law to follow PSD2, a directive that requires them to expose APIs for account access. I found a Stack Overflow post on generating the required certificates, but those are only supposed to be for testing purposes: https://stackoverflow.com/questions/50045376/how-to-create-eidas-certificate-with-qwac-and-qsealc-profiles-psd2-specific-att
You can use the PSD2 API to fetch the transactions from your account directly, which would be a lot less troublesome. There is also the woob (formerly weboob) project, which has web scrapers for a lot of banks (mostly French ones, but also some American ones like Amex).
I’ve been after something like this as well. My bank only provides statements in PDF format which is pretty irritating. I’ve seen this library recommended for extracting tables from PDFs if you are dealing with that issue. I’m not sure if there is a great way to automatically download the statements though.
https://py-pdf-parser.readthedocs.io/en/latest/overview.html
I just went the CSV-parsing route, but it’s only semi-automated. Still a lot better than manually entering everything. Annoyingly, every bank uses a different damn format, so you need a specific config for each of them.
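For anyone wondering what "specific config" can look like in practice, here is a minimal sketch: one dictionary entry per bank that maps its column names and date format onto a common shape. The bank names, column headers, and the "negate" flag are invented for illustration and don't correspond to any real bank's export.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Hypothetical per-bank layouts; real exports will use different headers.
BANK_FORMATS = {
    "bank_a": {
        "date": "Date",
        "payee": "Description",
        "amount": "Amount",
        "date_format": "%m/%d/%Y",
    },
    "bank_b": {
        "date": "Posted",
        "payee": "Merchant",
        "amount": "Debit",
        "date_format": "%Y-%m-%d",
        # This made-up bank lists spending as positive numbers.
        "negate": True,
    },
}


def normalize(csv_text, bank):
    """Convert one bank's CSV export into a list of uniform transaction dicts."""
    fmt = BANK_FORMATS[bank]
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        raw_amount = row[fmt["amount"]].replace("$", "").replace(",", "")
        amount = Decimal(raw_amount)
        if fmt.get("negate"):
            amount = -amount
        parsed_date = datetime.strptime(row[fmt["date"]], fmt["date_format"])
        rows.append(
            {
                "date": parsed_date.date().isoformat(),
                "payee": row[fmt["payee"]].strip(),
                "amount": amount,
            }
        )
    return rows
```

Adding a bank then means adding one entry to BANK_FORMATS rather than writing a new parser, at least as long as the export is a plain CSV with a header row.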