March 2024 – Sean Novak

Day One: Text Wrangling

We are starting with a wholly unusable PDF file (https://docs.house.gov/billsthisweek/20240318/WDI39597.PDF). First we need to parse it into usable data. Goal 1: Read and parse the document. Goal 2: Extract sections. Read and parse the document: What are some readily available open-source projects that I can use to parse PDFs into text? Let’s try some …

Continue reading “Day One: Text Wrangling”

US Spending Visualizations

This week another Uniparty Omnibus spending bill was passed without much of a fuss. I was thinking Speaker Johnson was going to be a force to stand up to the machine and reduce spending. I thought he was going to change things. I may have been mistaken. 😞 We need to get inflation under control, it’s like …

Continue reading “US Spending Visualizations”