Convert Bank Transactions XLS to CSV in Python
I’ve written previously on importing transactions to hledger/ledger from KBC bank in JavaScript and PTSB bank in Python. I took different approaches to each:
- For KBC, you needed to log in and run Javascript which scrape the transaction table and download it formatted as CSV
- For PTSB, the script automatted logging in, get the transaction table and save locally as CSV
Both approaches are valid - but suffer from the same issues: any change the bank makes to it website needs to be updated in the code. The KBC/JavaScript approach was a bit more robust in that it would just search for rows on a website and download as CSV.
With the EU payment services directive (PSD2) - banks have had to add two-factor authentication which hampers using an automated approach to log in and download. I’ve updated my original PTSB script to handle 2FA - buit it’s taken the magic out of it and made it feel very manual. So I’ve decided to change the approach again - just download the Excel transaction file the provide and convert to CSV. It should be much easier to maintain - I just need to be careful of which row transactions start/end and which columns to ignore.
Here’s my latest Python script to convert Permanent TSB Excel file to CSV:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import glob
import xlrd
import csv
def read_xls(file, start_row):
wb = xlrd.open_workbook(file)
sh = wb.sheet_by_index(0)
rows = []
for rownum in range(start_row, sh.nrows):
li = sh.row_values(rownum)
del li[1]
del li[-2]
rows.append(li)
del rows[-1]
return rows
def write_csv(filename, header, rows):
csv_file = open(filename, 'w', encoding='utf8')
wr = csv.writer(csv_file)
wr.writerow(header)
for row in rows:
wr.writerow(row)
def main():
path = "*.xls"
start_row=13
rows = read_xls(glob.glob(path)[0], start_row)
header = ['Date', 'Desc', 'Amount In', 'Amount Out', 'Balance']
filename = 'ptsb.csv'
write_csv(filename, header, rows)
if __name__ == '__main__':
main()