Monday, September 23, 2024

Back to "School": Learning the Basics of Protege, Python, and Cypher

A few years back, I inherited a substantial collection of vinyl records that had accumulated over two generations and two sides of the family. The result is a very eclectic collection that ranges from field recordings of the birds of Kenya to early-career Johnny Cash and lots of Edith Piaf. I have a simple record player, and once in a while over the past few years I have played some of my favorite records. About a month ago, having some free time, I decided to catalog the collection using my favorite tool, TiddlyWiki. In the process, I quickly learned how to use some TiddlyWiki features I had previously neglected. I had always used tagging, but I had never used the metadata fields.

I now have a complete catalog of close to 700 records, and it turns out to be an interesting data set for learning purposes because:

a) I can save the TiddlyWiki as a CSV file with all the metadata.

b) I am at the point in my exploration of Python where I can understand basic instructions and use Python in Excel to manipulate the data, including (perhaps) the less structured data (all the album titles and song titles). 
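As a first exercise with the exported file, here is a minimal sketch of reading the CSV and counting records per artist. I don't show the real export here, so the column names (`title`, `artist`, `year`) and the sample rows are placeholders I made up for illustration; the actual TiddlyWiki field names will probably differ.

```python
import csv
import io
from collections import Counter

# Placeholder rows standing in for the exported catalog. The column names
# ("title", "artist", "year") are assumptions, not the real export.
SAMPLE_CSV = """title,artist,year
Record A,Artist X,1957
Record B,Artist X,1960
Record C,Artist Y,1972
"""


def load_records(csv_file):
    """Read catalog rows into a list of dicts keyed by column name."""
    return list(csv.DictReader(csv_file))


def count_by_field(records, field):
    """Count how many records share each value of the given field."""
    return Counter(row[field].strip() for row in records if row.get(field))


records = load_records(io.StringIO(SAMPLE_CSV))
artist_counts = count_by_field(records, "artist")
print(artist_counts.most_common())
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by something like `open("catalog.csv", newline="", encoding="utf-8")`, where the file name is again just a guess.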

I was getting bored with Python exercises that use meaningless data. Hopefully, using my own dataset will be a little more exciting. Using the TiddlyWiki Records Catalog will also raise some data-cleaning issues. For example, I'm not sure I can easily extract a list of songs, even though I've "captured" all the data in the text section of each record.
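To get a feel for the song-extraction problem, here is a rough sketch of one way it might work. It assumes each track sits on its own line in the text section and starts with a bullet (`*`, `#`, `-`) or a track number such as `1.` or `B2`. That layout is an assumption for the sake of the example; my actual entries are less consistent, which is exactly where the cleaning work will come in. The function name `extract_songs` and the sample text are invented.

```python
import re

# Assumption: a track line starts with a bullet ("*", "#", "-") or a track
# number such as "1.", "2)" or "B2", followed by whitespace and the title.
TRACK_PREFIX = re.compile(r"^\s*(?:[*#-]+|[A-D]?\d+[.)]?)\s+")


def extract_songs(text):
    """Return the titles found on lines that look like track entries."""
    songs = []
    for line in text.splitlines():
        if not TRACK_PREFIX.match(line):
            continue  # headings such as "Side A" and free-form notes are skipped
        title = TRACK_PREFIX.sub("", line).strip()
        if title:
            songs.append(title)
    return songs


# Invented text section, for illustration only.
sample_text = """Side A
* First Song
* Second Song
Side B
1. Third Song
B2 Fourth Song
"""

songs = extract_songs(sample_text)
print(songs)
```

Anything that does not match the prefix pattern is silently dropped, so on real data it would be worth also collecting the skipped lines and reviewing them by hand.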

At times, when I am struggling to understand what I am doing with Python, I ask myself why I am trying to learn it in the first place, since I don't plan to become a programmer. Then I am reminded of my long-term goal: I need a better, broader understanding of data management and its associated tools to be able to converse intelligently about the technical aspects of knowledge graphs. As a result, I am learning the basics of Protege to understand ontology design, the basics of Python, the basics of Cypher to understand graph querying, etc.
