Case Studies

Everyone doing Data Analysis, me too, but WHY ?

Data analysis dates back to early civilizations, where records of trade and taxes were manually kept. In the 18th and 19th centuries, statistics emerged, with pioneers like Gauss and Nightingale shaping modern analysis. The 20th century saw breakthroughs with relational databases and SQL, revolutionizing data management. By the 1990s, business intelligence and data warehouses enabled deeper insights. Today, data analysis is a critical tool in finance, healthcare, and beyond, shaping the future of technology and business strategies. Thus, general workforce should take actions to upskill in order be a part of this unprecedented era.

Project 1: Doordash Analysis using Excel

Project Details

I like the skill share projects the most, though I got lost in thousands of them. Skillsharing gives us space to express our ideas and spread the knowledge. Furthermore, they are critical elements in building our online personal brands, while contributing and connecting with the community. In my childhood, I was afar from learning technology. They just did not connect to my soul. While in college and working at American Family Insurance, I gained hands-on experience with IBM Mainframe databases, Apex CRM, Guidewire, Salesforce, and several in-house platforms developed by AmFam. Through this journey, I realized that technology is our most powerful assistant. We spend years in school developing analytical thinking and learning structured decision-making processes�not just to acquire knowledge, but to understand how technology is built, how it works, and how to use it effectively.

I guess I was wandering a bit so lets come back to our story today. Assume that you work for DoorDash and help them figure business insights for a promotion campaign. Datasets are often prepared by data engineers�now it is our show.

This project helped me hone these tips and tricks in Excel:

  • Ctrl + A: Highlight sheet. Combine with Ctrl + C and paste in a new clean sheet.
  • Use "Define Name" for table ranges to navigate large files.
  • Clean data: remove duplicates, use filters to spot blanks.
  • Data Aggregation and OLAP data slicing ("data dice").
  • Pivot tables for insights: seniors order online more, campaign season trends, family behaviors, etc.
Pivot Summary Scatterplot Graph Data Screenshot Dashboard Overview

Project 3: The One Coding Language Everyone in Data Must Know - SQL

Project Overview

How much have countries borrowed from the World Bank? Since childhood, I often heard about this organization and its efforts to support global prosperity by providing loans to poorer nations. Today, by exploring a dataset from their website, we can uncover some interesting insights about global debt, repayment, and financial obligations worldwide.

1. Which countries paid off their loans?

SQL Query:

Repaying Countries Result Countries pay more than 50% Countries pay more than 50%

There were 31 countries which paid more than 50 % of their debts to WorldBank, which was easily seen by scrolling down the result. For practicing purposes, I created another sub-query and result shows the same 31.

We can see that China repaid a huge amount out of their loans, and Macedonia, paid 95%, which meant they are incredibly reliable. Also the one who pays a smallest portion was Korea, at 0%. I wonder if there is any discrepancies in this dataset, but it was downloaded from WorldBank so, maybe in further studies we can dig deeper later.

Query Result:

SQL Query for Repaying Countries
Min- Repaying Countries
2. How much did Vietnam owe, how has the repayment process been?

SQL Query:

Vietnam Loan Query
Repayment Query

Overall, Vietnam repaid to IDA $149 billion, which was 15% of the total owed $1,601 billion. We heard here and there stories like some countries loaned us, e.g., Japan $6 billion to create a highway, but the salary paid to the Japanese engineers were extremely high. So with these numbers, I would further explore data integrity from both sides of aid sources.

Query Result:

Result of Vietnam’s Loan
Repayment Result
3. Loan Transactions Across Countries

SQL Query:

Transactions by All Countries

From our query result, we can look at the case of Afghanistan, there are 5004 transactions fully disbursed and around 35% of it, 1854 transactions fully repaid. Up until the published date of this dataset, there were another 3195 transactions in progress of being disbursed. There are 667 repaying transactions. On the other hand, Africa, has around 121 repaid transactions

Query Result:

SQL Query for World Loan Transactions
4. Which Country Received the Maximum Loan?
SQL Query:

Most Owed

India has received the highest loan amount. No wonder, over the past 20 years, Indian talent has been widely recognized in computer science, mathematics, and engineering. Some cities have become unexpectedly wealthier due to increased investment in infrastructure and the growth of capitalism. However, the issue with foreign aid to Asian countries is that, despite the large sums provided, regional economic disparities have not significantly improved. In fact, in some cases, they have worsened.

Query Result:

SQL Query for Most Owed Country
5. Credit Status of Countries

SQL Query:

Loan Status Result

Query Result:

SQL Query for Loan Status
6. Loan Disbursements status for Individual Projects

SQL Query:

Project Loan Disbursement

Ever curious about how these project used their money, and how did it impacted their community? We need to know the projects' names and hopefully we can retrieve some insights from their local newspapers or reports about them. We can take a look at Ho Chi Minh city, where I grew up from to see if I can spot any familiar names of the transactions, in later queries.

Query Result:

SQL Query for Project Disbursement
7. Revising currency formats

SQL Query:

Formats Formats

I recognize our numbers was hard to read from afar so I revise the currency format. I also have some stem plot using SQL, weirdo right but funny.

Query Result:

Stem plot SQL Query for Formats

Key Findings

India received the most loans in the world, while my home country, Vietnam, ranked fifth. Its no surprise that in the 1980s and 1990s, my hometown had no skyscrapers at all. Now, it’s rising as the new dragon of Asia. Google might even establish a headquarters there soon—a source of pride for us. My father, who lived through two wars, deeply understood how difficult it was for Vietnam to join the WTO in the early 2000s. Our government had to overcome countless challenges to prove that we were building an open economy—one where capital didn’t belong solely to the government or any political party, but to the people themselves. Just two months ago, NVIDIA acquired the AI department of Vingroup, integrating it into their company—another milestone for Vietnam. Qualcom just bought a division of Vin-Ai as well. Though the country was once notorious for corruption, significant progress has been made over the past five years to address it, in line with the Party’s top priority: eradicating corruption. Over the past 40 years, we definitely have done some remarkable work from the ground up.

Some date and time format was saved as csv file so when when we analyze them in csvfiddle, we have to convert to sql file type. Similarly, currency format was not ready for at least 4 columns, have to convert them. I corrected the datetime format for these columns "First Repayment Date", "Last Repayment Date", "Last Repayment Date", "Effective Date", "Closed Date".

This case studies help me practice on some begining SQL query such as: GROUP BY, WHERE, Order By, Limit, making up some calculations formula using SQL, number and datetime formatting.

Watch the Full Analysis

Project 4: Is My Hospital Doing Well Financially?

Project Details

Imagine I am the CFO of Mars Health, looking at the datasets of 1998-2008 patients who had diabetes and were admitted. Based on these datasets, my team are trying to figure out if they were readmitted again with 30 days, which caused extra financial burden on the care facilities, as well as increasing rates of morbidity and fatality for patients as diabetese most of times come with other underlined conditions as well such as cardiovascular, nethology, oronary artery diseases.

  • Many people I knew never understood why they were not allowed to stay in the hospital for too long. This was the complete opposite of procedures in Asia, where patients can request to stay as long as they want. To be fair, every system has two sides. In Asia, 90% of medical expenses are paid out of pocket by patients. I've heard countless stories of people dying because their families couldn’t afford life-saving procedures. On the other hand, in America, there are often ways to intervene before giving up on a patient. Recently, there have been growing concerns about private insurance companies, highlighting another challenge within the U.S. healthcare system. Ultimately, there is no one-size-fits-all solution in the world.
  • Now lets focus on my role as the CFO of this Mars hospital. I need to see which departments have more than 50 patients who need around 2.5 procedures per stay. One procedure could be EKG, lab work, Xray, MRI, CT; while vital symptom measurements are at no charge. I have conducted 4 queries as of below.
1. Number of Procedures per Departments

SQL Query:

repaying

SQL Result:

repaying
2. Number of patients per department

SQL Query:

repaying

SQL Result:

repaying
3. Departments that have more than 50 patients.

SQL Query:

repaying

SQL Result:

repaying
4. From the above query, we add one more condition of more than 2.5 procedures

SQL Query:

repaying

SQL Result:

repaying
  • Next we look at the difference of number procedures conducted on patients with difference races. The result is that number of procedures are not much a difference based on race.
  • 5.Number of procedure by races

    SQL Query:

    repaying

    SQL Result:

    repaying

    Some further queries such as subqueries, CTE, Window Functions, Materialized Views

    The above query in problem statement #5 is an example of subqueries. Now we can revise the queries as below

    Subquery

    In this query, we join the two table: demographics and health, in which patient_nbr is the same column.

    We created new table named inner_table, which does not automatically refresh when the 2 tables' data are updated.

      âś… Key Benefits
    • âś” Materializing Data: The inner_table stores the results permanently, unlike a temporary view.
    • âś” Performance Optimization: Saves time by avoiding repeated calculations when querying the average lab procedures per race.
    • âś” Can Be Used for Further Analysis: You can now run new queries on inner_table without needing to join health and demographics repeatedly.
    cte

    Now we take a closer look at CTE by rewriting the materialized view query above as CTE.

    window function

    Instead of CTE, now we just use Window Function on the above query to see how long does it take for MySQL to process.

    Scenario Best Choice
    Need fast repeated queries on aggregated data âś… Materialized View
    Need a temporary, readable query structure âś… CTE
    Need row-wise calculations without losing details âś… Window Function
    Need real-time analytics (data must always be fresh) âś… CTE or Window Function
    Dataset is huge, and aggregation is expensive âś… Materialized View

    Key Business Takeaways

    Not surprisingly, cardiology and nethrology are the two departments that have the highes patient volumes with the most procedures done.

    My Insights

  • Examining case studies across various industries, it's clear that data is present in every department. Regardless of our position, we should upskill to dive deeper into internal insights, enhancing efficiency and productivity within the organization—ultimately boosting our job satisfaction.