I spend a lot of time writing, testing, and debugging code. Unfortunately, most of what I write these days is for other people, namely my employers.
On the rare occasions when I do get a few free hours to sit and write code for any one of a dozen different ideas, I usually spend them writing one-off modules that solve a technical hurdle in one project or another.
A few years back I became enamored with MongoDB and the challenging use cases it could address. The idea of throwing large amounts of unstructured data into it and querying it fascinated me and opened up new ways of thinking about how data is stored and queried beyond the comfort of the RDBMS world I knew before.
Without going on a tangent (too late!), I wanted to throw some structured data into MongoDB to mess around with. I grabbed a publicly available CSV file of cities, states, and ZIP codes. After digging through the Mongo docs I had on hand and searching Google, I concluded that MongoDB had no native way to simply load a comma-delimited (CSV) or tab-delimited (TSV) text file into a collection, the way a lot of RDBMS platforms do.
So I did what any self-respecting programmer would do: I wrote one. I wrote up a quick and VERY dirty Python module. It worked well enough for my immediate needs. I figured I could not be the only person needing to load CSV data into MongoDB, so I pushed my code to GitHub.
The module, csv2mongodb, was thrown together and tested within the space of about two hours. There is a LOT of room for improvement. But for a quick and arguably dirty load of CSV data into a MongoDB collection, it should do the trick.
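To give a feel for the general idea, here is a minimal sketch of a CSV loader. It is not the csv2mongodb code itself. The function name `load_csv`, the `batch_size` parameter, the file name `zipcodes.csv`, and the database and collection names are all made up for illustration. It assumes the first line of the file holds the column names and that the collection object exposes `insert_many`, as pymongo collections do.

```python
import csv


def load_csv(source, collection, delimiter=",", batch_size=1000):
    """Insert every row of a delimited text stream into `collection`.

    `source` is any open text file; the first line is assumed to be a header.
    Returns the number of documents inserted.
    """
    reader = csv.DictReader(source, delimiter=delimiter)
    batch, total = [], 0
    for row in reader:
        # Each row becomes one document keyed by the header names.
        # Every value stays a string; no type conversion is attempted.
        batch.append(dict(row))
        if len(batch) >= batch_size:
            collection.insert_many(batch)
            total += len(batch)
            batch = []
    if batch:
        # Flush whatever is left over after the last full batch.
        collection.insert_many(batch)
        total += len(batch)
    return total


if __name__ == "__main__":
    # Assumes pymongo is installed and a MongoDB server is running locally.
    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")
    collection = client["test"]["zipcodes"]  # hypothetical names
    with open("zipcodes.csv", newline="") as f:  # hypothetical file
        print(load_csv(f, collection), "documents inserted")
```

Passing `delimiter="\t"` would handle a TSV file the same way. Inserting in batches keeps memory use flat on large files, at the cost of a load that can stop partway through if a batch fails.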
I welcome suggestions for improvement and any constructive criticism.