Data analysis dates back to early civilizations, where records of trade and taxes were manually kept. In the 18th and 19th centuries, statistics emerged, with pioneers like Gauss and Nightingale shaping modern analysis. The 20th century saw breakthroughs with relational databases and SQL, revolutionizing data management. By the 1990s, business intelligence and data warehouses enabled deeper insights. Today, data analysis is a critical tool in finance, healthcare, and beyond, shaping the future of technology and business strategies. Thus, general workforce should take actions to upskill in order be a part of this unprecedented era.
I like the skill share projects the most, though I got lost in thousands of them. Skillsharing gives us space to express our ideas and spread the knowledge. Furthermore, they are critical elements in building our online personal brands, while contributing and connecting with the community. In my childhood, I was afar from learning technology. They just did not connect to my soul. While in college and working at American Family Insurance, I gained hands-on experience with IBM Mainframe databases, Apex CRM, Guidewire, Salesforce, and several in-house platforms developed by AmFam. Through this journey, I realized that technology is our most powerful assistant. We spend years in school developing analytical thinking and learning structured decision-making processes�not just to acquire knowledge, but to understand how technology is built, how it works, and how to use it effectively.
I guess I was wandering a bit so lets come back to our story today. Assume that you work for DoorDash and help them figure business insights for a promotion campaign. Datasets are often prepared by data engineers�now it is our show.
This project helped me hone these tips and tricks in Excel:
How much have countries borrowed from the World Bank? Since childhood, I often heard about this organization and its efforts to support global prosperity by providing loans to poorer nations. Today, by exploring a dataset from their website, we can uncover some interesting insights about global debt, repayment, and financial obligations worldwide.
SQL Query:
There were 31 countries which paid more than 50 % of their debts to WorldBank, which was easily seen by scrolling down the result. For practicing purposes, I created another sub-query and result shows the same 31.
We can see that China repaid a huge amount out of their loans, and Macedonia, paid 95%, which meant they are incredibly reliable. Also the one who pays a smallest portion was Korea, at 0%. I wonder if there is any discrepancies in this dataset, but it was downloaded from WorldBank so, maybe in further studies we can dig deeper later.
Query Result:
SQL Query:
Overall, Vietnam repaid to IDA $149 billion, which was 15% of the total owed $1,601 billion. We heard here and there stories like some countries loaned us, e.g., Japan $6 billion to create a highway, but the salary paid to the Japanese engineers were extremely high. So with these numbers, I would further explore data integrity from both sides of aid sources.
Query Result:
SQL Query:
From our query result, we can look at the case of Afghanistan, there are 5004 transactions fully disbursed and around 35% of it, 1854 transactions fully repaid. Up until the published date of this dataset, there were another 3195 transactions in progress of being disbursed. There are 667 repaying transactions. On the other hand, Africa, has around 121 repaid transactions
Query Result:
India has received the highest loan amount. No wonder, over the past 20 years, Indian talent has been widely recognized in computer science, mathematics, and engineering. Some cities have become unexpectedly wealthier due to increased investment in infrastructure and the growth of capitalism. However, the issue with foreign aid to Asian countries is that, despite the large sums provided, regional economic disparities have not significantly improved. In fact, in some cases, they have worsened.
Query Result:
SQL Query:
Query Result:
SQL Query:
Ever curious about how these project used their money, and how did it impacted their community? We need to know the projects' names and hopefully we can retrieve some insights from their local newspapers or reports about them. We can take a look at Ho Chi Minh city, where I grew up from to see if I can spot any familiar names of the transactions, in later queries.
Query Result:
SQL Query:
I recognize our numbers was hard to read from afar so I revise the currency format. I also have some stem plot using SQL, weirdo right but funny.
Query Result:
India received the most loans in the world, while my home country, Vietnam, ranked fifth. Its no surprise that in the 1980s and 1990s, my hometown had no skyscrapers at all. Now, it’s rising as the new dragon of Asia. Google might even establish a headquarters there soon—a source of pride for us. My father, who lived through two wars, deeply understood how difficult it was for Vietnam to join the WTO in the early 2000s. Our government had to overcome countless challenges to prove that we were building an open economy—one where capital didn’t belong solely to the government or any political party, but to the people themselves. Just two months ago, NVIDIA acquired the AI department of Vingroup, integrating it into their company—another milestone for Vietnam. Qualcom just bought a division of Vin-Ai as well. Though the country was once notorious for corruption, significant progress has been made over the past five years to address it, in line with the Party’s top priority: eradicating corruption. Over the past 40 years, we definitely have done some remarkable work from the ground up.
Some date and time format was saved as csv file so when when we analyze them in csvfiddle, we have to convert to sql file type. Similarly, currency format was not ready for at least 4 columns, have to convert them. I corrected the datetime format for these columns "First Repayment Date", "Last Repayment Date", "Last Repayment Date", "Effective Date", "Closed Date".
This case studies help me practice on some begining SQL query such as: GROUP BY, WHERE, Order By, Limit, making up some calculations formula using SQL, number and datetime formatting.
Imagine I am the CFO of Mars Health, looking at the datasets of 1998-2008 patients who had diabetes and were admitted. Based on these datasets, my team are trying to figure out if they were readmitted again with 30 days, which caused extra financial burden on the care facilities, as well as increasing rates of morbidity and fatality for patients as diabetese most of times come with other underlined conditions as well such as cardiovascular, nethology, oronary artery diseases.
SQL Query:
SQL Result:
SQL Query:
SQL Result:
SQL Query:
SQL Result:
SQL Query:
SQL Result:
SQL Query:
SQL Result:
The above query in problem statement #5 is an example of subqueries. Now we can revise the queries as below
In this query, we join the two table: demographics and health, in which patient_nbr is the same column.
We created new table named inner_table, which does not automatically refresh when the 2 tables' data are updated.
Now we take a closer look at CTE by rewriting the materialized view query above as CTE.
Instead of CTE, now we just use Window Function on the above query to see how long does it take for MySQL to process.
Scenario | Best Choice |
---|---|
Need fast repeated queries on aggregated data | âś… Materialized View |
Need a temporary, readable query structure | âś… CTE |
Need row-wise calculations without losing details | âś… Window Function |
Need real-time analytics (data must always be fresh) | âś… CTE or Window Function |
Dataset is huge, and aggregation is expensive | âś… Materialized View |
Not surprisingly, cardiology and nethrology are the two departments that have the highes patient volumes with the most procedures done.