A few years back, I inherited a substantial collection of vinyl records that had accumulated over two generations and two sides of the family. The result is a very eclectic collection that ranges from field recordings of birds of Kenya to early-career Johnny Cash and lots of Edith Piaf. I have a simple record player, and once in a while over the past few years I have played some of my favorite records. About a month ago, having some free time, I decided to catalog the collection using my favorite tool, TiddlyWiki. In the process, I quickly learned how to use some TiddlyWiki features I had previously neglected. I had always used tags, but I had never used the metadata fields.
I now have a complete catalog with close to 700 records and this turns out to be an interesting data set for learning purposes because:
a) I can save the TiddlyWiki as a CSV file with all the metadata.
b) I am at the point in my exploration of Python where I can understand basic instructions and use Python in Excel to manipulate the data, including (perhaps) the less structured data (all the album titles and song titles).
I was getting bored with Python exercises that use meaningless data. Hopefully using my own dataset will be a little more exciting. Using the TiddlyWiki Records Catalog will also raise some data-cleaning issues. For example, I'm not sure I can easily extract a list of songs, even though I've "captured" all the data in the text section of each record.
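A minimal sketch of what that song extraction might look like, using only Python's standard `csv` module. Everything specific here is an assumption rather than a description of the actual catalog: the column names (`title`, `artist`, `text`), the placeholder sample rows, and the idea that each song sits on its own bullet line (`* ...`) in the text field. A real export would likely need more cleaning than this.

```python
import csv
import io

# Hypothetical sample standing in for the exported CSV file.
# Column names and the one-song-per-bullet layout are assumptions.
SAMPLE_CSV = '''title,artist,text
"Record A","Artist A","* Song One
* Song Two"
"Record B","Artist B","Some liner notes
* Song Three"
'''


def extract_songs(csv_file):
    """Return a list of (record title, song title) pairs."""
    songs = []
    for row in csv.DictReader(csv_file):
        for line in row["text"].splitlines():
            line = line.strip()
            # Keep only lines written as bullet items ("* ...");
            # anything else in the text field is skipped.
            if line.startswith("*"):
                song = line.lstrip("* ").strip()
                songs.append((row["title"], song))
    return songs


songs = extract_songs(io.StringIO(SAMPLE_CSV))
for record, song in songs:
    print(f"{record}: {song}")
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by something like `open("records.csv", newline="", encoding="utf-8")`, where the file name is again only a placeholder.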
At times, when I am struggling to understand what I am doing with Python, I ask myself why I am trying to learn Python in the first place, since I don't plan to become a programmer. Then I am reminded of my long-term goal: I need a better, broader understanding of data management and its associated tools to be able to converse intelligently about the technical aspects of knowledge graphs. As a result, I am learning the basics of Protege to understand ontology design, the basics of Python, the basics of Cypher to understand graph querying, etc.