The Audubon Society’s annual Christmas Bird Count (CBC) is the longest-running annual volunteer species count in the world, and has been going on since 1900. That means there’s a lot of data!
Anyone can download the data from Audubon for personal or non-profit/academic use. However, the data isn’t in the friendliest format around. While you can download it in CSV, PDF, Excel, or XML format, the structure is very difficult to work with.
The data mixes total count records with other data points such as weather, participants, and the number of hours spent counting. Rather than separating the data across multiple spreadsheet pages, or splitting the different kinds of data into separate files, everything is crammed into one.
This is why I’ve written a parsing tool for working with the historical CBC data.
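To give a rough idea of what "crammed into one" means in practice, here is a minimal Python sketch of pulling different kinds of records apart. Everything in it is an assumption made for illustration: the sample layout (blocks separated by blank lines, each with its own header row), the column names, the numbers, and the `split_sections` function are all hypothetical. It is not the real CBC export format, and it is not the code from the parsing tool.

```python
import csv
import io

# Hypothetical sample: three kinds of records stacked in one CSV file,
# each block with its own header row, separated by blank lines.
# This layout and these values are invented, not the real CBC export.
SAMPLE = """\
CountYear,LowTemp,HighTemp
2019,28,41

CountYear,Participants,Hours
2019,14,36.5

Species,CountYear,Number
Canada Goose,2019,212
Mallard,2019,57
"""


def split_sections(text):
    """Split a mixed CSV into one table per blank-line-delimited block.

    Returns a list of tables; each table is a list of dicts keyed by
    that block's own header row. Values are left as strings.
    """
    blocks = []
    current = []
    for row in csv.reader(io.StringIO(text)):
        # csv.reader yields an empty list for a blank line.
        if not any(cell.strip() for cell in row):
            if current:
                blocks.append(current)
                current = []
            continue
        current.append(row)
    if current:
        blocks.append(current)

    tables = []
    for block in blocks:
        header, *rows = block
        tables.append([dict(zip(header, r)) for r in rows])
    return tables


weather, effort, counts = split_sections(SAMPLE)
```

With the sample above, `counts` holds only the species rows, while `weather` and `effort` each hold their own single row, so each kind of data can be handled on its own.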
You can read more on the GitHub repo, and can even comment or contribute ideas. It currently requires some technical know-how to use, but I aim to improve that over time as the bugs get worked out.