# Behind the Scenes of my Learning Journey
This book documents my journey as I learn different technologies, sharing both the key ideas I found and the mistakes I made - so you don't have to.
## Sections within this book
This book contains the following sections:
### 1. Functional Pandas in Python workflow
I have been using Pandas and NumPy (the Python libraries) since late 2018, but only now have I needed to refactor my operations into reusable functions. In this section, I share what I have learned behind the scenes.
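To sketch what I mean by a functional Pandas workflow - small, pure functions chained with `DataFrame.pipe` - here is a minimal example (the column names and threshold are invented for illustration):

```python
import pandas as pd

def add_total(df: pd.DataFrame) -> pd.DataFrame:
    # Derive a total column; returns a new frame instead of mutating in place
    return df.assign(total=df["price"] * df["quantity"])

def keep_large_orders(df: pd.DataFrame, threshold: float) -> pd.DataFrame:
    # A filter step expressed as a plain function, so it can be piped and unit-tested
    return df[df["total"] >= threshold]

orders = pd.DataFrame({"price": [10.0, 3.5, 20.0], "quantity": [1, 4, 2]})

# Each step is a named function, chained with DataFrame.pipe
result = orders.pipe(add_total).pipe(keep_large_orders, threshold=14.0)
```

Because each step takes a frame and returns a new one, steps can be reordered, tested in isolation, and reused across notebooks.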
### 2. Jupyter Books explored
Jupyter Book is a flexible tool built on top of Sphinx to:

- tell data stories, and/or
- share code that allows readers to replicate experiments, and/or
- discuss scientific research
In this section, I take you through how I learned to set up this Jupyter Book - it is my first time working with documentation tooling in general.
### 3. Natural Language Processing explored
My journey in this space began with a need to identify the in-demand skills of marketing researchers: I wanted to create research that not only impacts marketing operations in a business (marketing was my first qualification) but also leverages tech to ease those operations. That project helped me mine and handle text data, and it set the stage for my NLP project as a 2021 Delta Analytics fellow.
Machine Learning, and Natural Language Processing (NLP) in particular, affects our daily lives more and more as tech advances. In this section, I discuss my journey learning NLP.
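As a toy illustration of the kind of text mining that skill-identification project involved (the postings and skill list below are invented for the example; the real project used its own data), counting skill mentions needs nothing beyond the standard library:

```python
import re
from collections import Counter

# Hypothetical snippets of marketing-researcher job postings
postings = [
    "Seeking analyst with SQL, Excel and survey design experience",
    "Marketing researcher: Python, SQL and dashboarding skills preferred",
    "Must know Excel; Python a plus",
]

# The skills we want to tally across postings
skills = {"sql", "python", "excel", "survey"}

counts = Counter(
    token
    for text in postings
    for token in re.findall(r"[a-z]+", text.lower())
    if token in skills
)
# counts maps each skill to how many times it appears across postings
```

Real pipelines add tokenization, stemming, and multi-word phrase handling on top of this idea.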
### 4. Developing in the cloud
One of the things I learned during my internship at Adrian Ltd. was that machine learning needs a home with a nice, clean frontend - Jupyter notebooks were not appreciated as much by people outside the AI/ML community. A popular way to deploy models was Flask; a colleague at the time recommended Django as a more robust and secure alternative.
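A minimal sketch of the Flask deployment pattern mentioned above (the endpoint name and the stand-in "model" are assumptions for illustration; a real app would load a trained model instead):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict(features):
    # Stand-in for a trained model; a real app would call model.predict here
    return sum(features)

@app.route("/predict", methods=["POST"])
def predict_endpoint():
    # Accept a JSON body like {"features": [1, 2, 3]} and return a prediction
    payload = request.get_json()
    return jsonify({"prediction": predict(payload["features"])})

# Run locally with: flask run (the built-in server is for development only)
```

The notebook stays for experimentation; the Flask app gives non-technical stakeholders a clean HTTP interface to the same model.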
When pursuing my studies, with life slowing down as a result of the pandemic, I saw building a website for the first time as a way to keep busy, learn web development and deployment, and share my Master's journey.
Here I share the challenges I faced (especially interacting with Linux for the first time and setting up an SSL certificate) and the lessons I learned along the way.
## Future Sections in this book
### 1. Streamlit rediscovered
I first came across Streamlit in a Data Umbrella webinar, and it really piqued my interest. Fast-forward to my fellowship a couple of months later: I got the opportunity to learn it properly in a coding lab organised by a Delta Analytics facilitator, Brian Spiering.
It was not only enjoyable but also forced me to get my hands dirty (cue an aha! moment: I learn best in a project-based, hands-on style rather than a webinar-listening style).
This came in handy when I got the opportunity to use Streamlit for a localised, screen-shared proof-of-concept demo. It really made the demo come alive, and I received great suggestions on my modelling approach, such as considering reinforcement learning and optimizing my code for speed.
So in this section, I aim to share what I have learned and to dig deeper. A key goal I have around Streamlit is learning how to customize the front-end, and seeing whether I can incorporate transformers.
### 2. Graph Databases Relearned
I interacted with graph technology for the first time in a team at Bootcamp 33, and it was pretty awesome. In this section I aim to share the strategies that helped me understand the concept the first time round, and to retrace:

- how to curate data for a graph
- how to set up the schema for a graph (the logic behind connections, and visualizing the schema)
- how to write and run queries against a graph
I aim to share both platform-specific and platform-agnostic solutions around graphs.
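To give a platform-agnostic flavour of the three steps above (curate, define a schema, query), here is a toy graph held in plain Python dictionaries; the node and edge names are invented for the example, and a real graph database would replace all of this:

```python
# Step 1 - curated data: (source, relation, target) edge triples
edges = [
    ("amina", "KNOWS", "brian"),
    ("amina", "USES", "neo4j"),
    ("brian", "USES", "neo4j"),
    ("brian", "USES", "tigergraph"),
]

# Step 2 - a minimal "schema": an adjacency map keyed by (node, relation)
graph = {}
for src, rel, dst in edges:
    graph.setdefault((src, rel), []).append(dst)

# Step 3 - queries over the structure
def neighbours(node, relation):
    # Which nodes does `node` reach via `relation`?
    return graph.get((node, relation), [])

def users_of(tool):
    # Every node with a USES edge pointing at `tool`
    return sorted(src for (src, rel), dsts in graph.items()
                  if rel == "USES" and tool in dsts)
```

Query languages like Cypher or GSQL express the same idea declaratively, but the underlying model - nodes, typed relationships, traversals - is the same.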
### 3. The Kiva project
In early 2019, I volunteered as a mentor in Master Cohorts, a local initiative to motivate Kenyans to master Data Science. I decided to attempt the first assignment, which was based on a past Kaggle competition aimed at solving a problem for Kiva.
One of the highlights of working on this project was that I really stretched my Pandas skills and got to marry two completely different datasets to create the final notebook.
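A sketch of the kind of dataset "marriage" this involved, using `pandas.merge` (the column names and values below are made up for illustration; the Kaggle data has its own schema):

```python
import pandas as pd

# Hypothetical loan records, one row per region
loans = pd.DataFrame({
    "region": ["Nairobi", "Kisumu", "Mombasa"],
    "loan_amount": [500, 300, 450],
})

# A second, separately sourced dataset covering only some regions
poverty = pd.DataFrame({
    "region": ["Nairobi", "Mombasa"],
    "poverty_index": [0.42, 0.55],
})

# A left join keeps every loan, attaching a poverty score where one exists;
# regions missing from `poverty` get NaN, which must be handled downstream
combined = loans.merge(poverty, on="region", how="left")
```

The hard part in practice is rarely the merge call itself - it is reconciling keys (spellings, granularities, units) so the two datasets can be joined at all.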
## My Learning Path
| Long-term Learning Goals | Short-term Learning Goals |
|---|---|
| PostgreSQL in Django | Shared Excel worksheet integration with a PowerBI workspace |
| ML model in Flask with Dash | PowerBI learning community engagement |
| Transformers in Kubeflow Pipelines | PowerQuery automations |
| Data mining (via APIs) & storage in Kubernetes | |
| Working with Marketing data | Working with Reporting data |
### Desired outcomes
- Create projects showing technical and business competence
- Master glocal data governance working with small decentralized data
- Master data pipelines with seamless integration