Abstract
COVID-19 has been reported to disproportionately affect minorities in terms of mortality rate1, so this paper asks whether the same disparity also applies to unemployment rate. Through data analysis in Python and K-means clustering, an unsupervised machine learning algorithm based on minimising within-cluster variance around an average point known as a centroid, the bias in unemployment rate trends across race/ethnicities and sex in the U.S. amidst the pandemic was investigated. Data from the U.S. Bureau of Labor Statistics and the U.S. Census Bureau was analysed, and K-means clustering was applied on a larger dataset based on the initial findings. From the initial data analysis, it was found that although the Black population entered the pandemic with the greatest unemployment rate, the Hispanic population surpassed them during the pandemic with the most drastic increase in unemployment rate. Unexpectedly, when clustering U.S. counties by race/ethnicities, predominantly white counties had higher increases in unemployment rate than predominantly minority counties. However, when looking at the national population as a whole, predominantly minority counties showed a greater increase in unemployment rate. When the clustering was restricted to census-designated geographic regions of the US, it was found that in the West, counties with predominantly minority populations saw a higher increase in unemployment rate than white counties. These results may be attributed to discriminatory social norms, pre-existing imbalances in job employment demographics, or the choice of industry. In the future, as the pandemic progresses, more data will be readily available to confirm this paper’s findings and support further inquiry. These findings may serve as guidance for government relief policies like the distribution of stimulus checks, and they may help predict how future waves of COVID-19 will influence the economy.
Introduction
COVID-19, a highly-contagious virus2, has dramatically altered the lifestyles of individuals as well as society as a whole. As states have taken preemptive measures to prevent the spread of the virus, many businesses have been forced to close physical locations, declare bankruptcy, and lay off their workers. This has led unemployment rates to peak at unprecedented levels, greater than those of the Great Recession in 20083.
It is essential to recognize the possible disparity in lay-offs caused by COVID-19 so that the government may provide adequate support to all groups. Analysis of the influence of COVID-19 on the economy can serve as an example of how future waves of the virus and other pandemics will affect different demographics. Thus, this paper attempts to answer the central question: Has the COVID-19 pandemic disproportionately affected minorities in terms of unemployment rate? The insights from this study will potentially reveal the systemic racial, socioeconomic, and income inequalities that perpetuate poverty cycles in American society and the world.
To explore demographic bias in unemployment rate, this paper analyses various datasets detailing lay-off rates by race/ethnicity, sex, industry, and/or county in the United States. This paper details the results of the various analyses based on unemployment rates across the continental U.S. counties and the nation as a whole with their significance in terms of social justice.
Literature Review
Although the COVID-19 pandemic is a recent development, it has already been reported that African Americans are disproportionately affected by the pandemic in terms of mortality rate1. For instance, for the 131 predominantly black counties in the U.S., the mortality rate of COVID-19 (6.3/100,000 on average) is reported to be more than five-fold that of predominantly white counties (1.1/100,000 on average) as of early April 20201. In addition, disparities in unemployment rate have persisted even before COVID-19. For instance, by plotting Black unemployment from 1972 to 2019 versus double of the White unemployment rate, a study conducted by American Progress revealed that Black unemployment rates are consistently twice those of the white population4. Certain factors for this include the high incarceration rates for Black Americans and discrimination. An article published by the Center on Budget and Policy Priorities revealed that Black populations have higher unemployment rates and that recession widens these gaps5. Similar to the previous analysis, results show that age and education do not fully account for these disparities. Therefore, a disparity between the unemployment rates of minority groups and those of the White population is expected. However, the extent of this disparity is difficult to predict. Additionally, the Pew Research Center used data from their surveys in both March and April of 2020, for a total sample size of about 15,000 people, to determine demographic disparities. Its conclusions included that Hispanics were hit especially hard, and many did not have enough savings to fall back on, especially for monthly bills. The research is based on a much smaller sample size than that of the BLS, so its survey data may not be representative of the entire population6.
In a PMC study, city surveys and Chi-square tests found substantial evidence of social and structural discrimination against the Black and Latino populations as well as other marginalized demographics amidst COVID-19 in Chicago. As the study’s conclusions were specific to Chicago, it left much to be discovered regarding areas beyond Illinois and rural areas7.
Methods
Methodology Overview and Compiling of Datasets.
This paper considers the four racial/ethnic groups given by the U.S. Bureau of Labor Statistics (BLS): White (defined as non-Hispanic White), Latino/Hispanic, Black/African American, and Asian. BLS defines an unemployed person as someone who does not have a job, but has actively looked for work in the prior 4 weeks and is available to work at the time of the survey8. Unemployment rate is defined as “the number of unemployed people as a percentage of the labor force (the labor force is the sum of the employed and unemployed)”. Point increase is defined as the arithmetic difference between two percentages, while percentage increase is defined as the difference between two numbers divided by the original number and multiplied by 100 percent.
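$$\text{percentage increase} = \frac{\text{new value} - \text{original value}}{\text{original value}} \times 100\%$$
(Equation 1)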
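The distinction between a point increase and a percentage increase can be illustrated with a minimal sketch. The function names and the two rates below are hypothetical and chosen only for illustration; they are not BLS figures.

```python
def point_increase(before, after):
    """Arithmetic difference between two percentages, in percentage points."""
    return after - before


def percentage_increase(before, after):
    """Relative change: the difference divided by the original value, times 100."""
    return (after - before) / before * 100


# Hypothetical unemployment rates (in percent), for illustration only.
rate_2019 = 4.0
rate_2020 = 10.0

print(point_increase(rate_2019, rate_2020))       # 6.0 (points)
print(percentage_increase(rate_2019, rate_2020))  # 150.0 (percent)
```

The same pair of rates therefore gives a 6.0 point increase but a 150% percentage increase, which is why the two measures are reported separately.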
The main source for data was the BLS, which is comprised of numerous comprehensive datasets, including one regarding national unemployment rates separated by industry, race/ethnicity, or sex. The tabular data was cleaned by downloading these datasets and importing into either Microsoft Excel or Google Sheets and then manually removing unnecessary text and converting non-numeric entries to numeric data for future parsing. The dataset was converted to a comma-separated value (CSV) file by exporting via built-in Excel functionality to analyze in Python and Jupyter Notebook. Here, several Python3 modules including Pandas, Matplotlib, Numpy, and Scikit-learn were utilised to search for relationships between unemployment rates during COVID-19 and certain demographic groups.
Excluding K-means, the other methods of analysis only required cleaning and graphing the data, indicating a low possibility of error. However, error is possible in the discussion and handling of these results. Correlation between certain trends may not be indicative of a causational relationship. For instance, the percentage of each race or sex in each occupation or field may not have played a significant role in the relationship between the unemployment rates of each race and sex.
A. Increase in Unemployment Rate by Race and Sex. Using 2020 data from the BLS9, the paper analysed the unemployment rate of each sex by race/ethnicity as recorded in June of 2019 and June of 2020, which correspond to before and during the COVID-19 pandemic. The unemployment rate by sex was visualized on a quadruple bar graph in which the x-axis displays the races/ethnicities (White, Black, Hispanic, and Asian) and the y-axis displays the unemployment rate. The difference in unemployment rate before and during the pandemic was then calculated for the two sexes in each race/ethnicity to determine whether there was a disproportionate impact on a particular sex. Figure 1 compares national unemployment for race/ethnicity and sex before and during the COVID-19 pandemic.
B. Increase in Unemployment Rate by Race and Occupation. This research also analysed the difference in unemployment rate between July 2019 and July 2020 for each race/ethnicity in the five BLS occupations10. For each occupation, the difference in unemployment rate was calculated by subtracting the unemployment rate in July 2019 from the unemployment rate in July 2020. This was plotted onto a bar graph in which the x-axis is the occupation and the y-axis is the change in unemployment rate.
C. Increase in Unemployment Rate by Sex for Industry and Occupation. The difference in unemployment rate between July 2019 and July 2020 was analysed for the two sexes11. The five occupations analysed were management, professional, and related occupations; service occupations; sales and office occupations; natural resources, construction, and maintenance occupations; and production, transportation, and material moving occupations. For each occupation, the difference in unemployment rate was calculated by subtracting the unemployment rate in July 2019 from the unemployment rate in July 2020. This was plotted onto a bar graph where the x-axis is the occupation and the y-axis is the change in unemployment rate.
D. Linear Regression. The research also includes a multivariable linear regression to analyse the unemployment rate of different races/ethnicities over time based on a BLS dataset12. Indicator variables were first created for each race/ethnicity category, along with a numeric variable for the month (e.g. April = 4). The dataset included the unemployment rate for the White, Black, Hispanic, and Asian population in April, May, and June of 2020. The dataset depicts the unemployment rate spike in April, and the slow decline in the following months of May and June. This dataset was analysed using the ordinary least squares (OLS) regression from the statsmodels API. OLS is the standard fitting method for multivariable linear regression, which models the linear relationship between a response and several explanatory variables. To find the best linear model, OLS computes the squared difference between each data point and the value predicted by the line, and chooses the coefficients that minimise the sum of these squared differences, thus producing the line with the least error. The results of a linear regression include a constant as well as coefficients corresponding to each factor, which together constitute a linear equation:
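$$\text{unemployment rate} = \text{constant} + \sum_{i} \text{coefficient}_i \times \text{factor}_i$$
(Equation 2)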
Each coefficient shows how much the corresponding factor contributes to the predicted unemployment rate; therefore, it can be inferred that a larger coefficient for a race/ethnicity means a higher predicted unemployment rate for that group, holding the month fixed. Furthermore, the linear regression outputs measures of statistical significance and fit, such as the P value and the R2 value. A P value of less than 0.05 indicates that the corresponding coefficient is statistically significant at the 5% level. The R2 value lies between 0 and 1; the closer the value is to 1, the greater the share of the variation in the data that the model explains.
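The regression setup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the study's statsmodels code: the `fit_ols` helper is a hypothetical pure-Python stand-in that solves the normal equations directly, the model is fitted without a separate constant so that each of the four groups receives its own coefficient, and the rates are synthetic values generated from a known linear rule rather than BLS data.

```python
def fit_ols(X, y):
    """Least-squares coefficients from the normal equations (X^T X) b = X^T y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    p = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda idx: abs(A[idx][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


groups = ["White", "Black", "Hispanic", "Asian"]
months = [4, 5, 6]  # April = 4, May = 5, June = 6

# Synthetic values built from a known linear rule (assumption, NOT BLS data).
true_level = {"White": 18.0, "Black": 22.0, "Hispanic": 23.0, "Asian": 20.0}
true_month_effect = -1.5

X, y = [], []
for g in groups:
    for m in months:
        indicators = [1.0 if g == other else 0.0 for other in groups]
        X.append(indicators + [float(m)])  # four indicator columns plus month
        y.append(true_level[g] + true_month_effect * m)

coefs = fit_ols(X, y)
for name, b in zip(groups + ["month"], coefs):
    print(f"{name}: {b:.2f}")
```

Because the synthetic data follow the linear rule exactly, the fitted coefficients recover the group levels and the negative month effect; with real data the fit would only approximate them, and the P values and R2 reported by statsmodels would indicate how well.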
E. K-means Clustering. K-means clustering is an unsupervised machine learning algorithm which aims to partition n observations into k clusters. A cluster refers to a collection of data points aggregated together because of certain similarities. The hyperparameter k defines the target number of clusters, with each cluster containing a centroid, the center of the cluster. The K-means algorithm then allocates all n observations to their nearest clusters by minimizing within-cluster variance (i.e. the squared straight-line distance between each point and its centroid). This iterative algorithm can be applied to clusters in multiple dimensions; 3-dimensional clustering was utilised in this study. K-means clustering was chosen over regression-based approaches because it fits this data well and can cluster in high dimensions. However, K-means, when not using correct hyperparameters, can often overfit or underfit the data, leading to a possible source of error. Overfitting the data means that the machine learning algorithm merely memorises the training data, resulting in better training accuracy than testing accuracy. This leads to patterns that are too specific and inapplicable to a larger, real set of data. The contrary, underfitting, is ineffective in revealing possible relationships between the variables that are being researched. To minimise this source of error, the silhouette and elbow methods (discussed below) were used to find the optimal number of clusters.
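The iterative procedure described above is commonly implemented as Lloyd's algorithm, which alternates an assignment step and a centroid-update step. The sketch below is a plain-Python illustration only (the study itself used Scikit-learn); the `kmeans` function and the nine three-dimensional points, given as (percent White, percent minority, point increase), are hypothetical.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(dim) / len(members) for dim in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hypothetical counties: (percent White, percent minority, point increase).
points = [
    (90.0, 10.0, 9.0), (88.0, 12.0, 9.4), (92.0, 8.0, 9.1),
    (65.0, 35.0, 8.2), (63.0, 37.0, 8.4), (66.0, 34.0, 8.3),
    (35.0, 65.0, 8.0), (33.0, 67.0, 8.2), (36.0, 64.0, 8.1),
]

centroids, labels = kmeans(points, k=3)
for c in centroids:
    print(tuple(round(v, 2) for v in c))
```

The result depends on the random starting centroids, and the algorithm can settle in a local optimum; library implementations typically rerun it from several starting points and keep the best partition.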
E.1. Finding the Optimal Number of Clusters. To find the optimal number of clusters, which would become the algorithm’s hyperparameter, the silhouette method and the elbow method were utilised.
The silhouette method outputs a silhouette number, which is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette method produces a value from -1 to +1, where a value closer to +1 demonstrates that the object is more similar to its own cluster than to neighboring clusters. More objects having higher silhouettes demonstrates that the number of clusters is appropriate, while lower numbers may demonstrate that there are too few or too many clusters. The number of clusters was plotted on the x-axis and the silhouette number on the y-axis. Numbers of clusters with above average (0.50+) silhouette scores are possible candidates for the optimal number of clusters. The elbow method analyses the total within-cluster sum of squares (WSS). The total WSS measures the compactness of the clusters, so lower values indicate tighter clusters; because WSS always decreases as k grows, the aim is not to minimise it outright but to find the point of diminishing returns. Using the sklearn library in Python, the WSS was calculated for different values of k (the number of clusters) ranging from 1 to 10. Then, the WSS was plotted as a function of k, and the k value at which the graph showed a visible bend ("elbow") was identified.
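Both quantities can be computed directly from a set of points and their cluster labels. The sketch below is a plain-Python illustration rather than the sklearn routines used in the study; the helper names and the four one-dimensional points are hypothetical, and the silhouette of a single-member cluster is set to 0 by convention.

```python
import math


def wss(points, labels):
    """Total within-cluster sum of squared distances to each cluster mean."""
    total = 0.0
    for c in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == c]
        centroid = tuple(sum(dim) / len(members) for dim in zip(*members))
        total += sum(math.dist(p, centroid) ** 2 for p in members)
    return total


def mean_silhouette(points, labels):
    """Average of s = (b - a) / max(a, b); needs at least two clusters."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        own = [math.dist(p, q)
               for j, (q, lab_q) in enumerate(zip(points, labels))
               if lab_q == lab and j != i]
        if not own:
            scores.append(0.0)  # convention for single-member clusters
            continue
        a = sum(own) / len(own)  # cohesion: mean distance within own cluster
        b = min(                 # separation: mean distance to nearest other cluster
            sum(math.dist(p, q) for q, lab_q in zip(points, labels) if lab_q == other)
            / labels.count(other)
            for other in set(labels) if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical one-dimensional points with an obvious two-cluster structure.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
good_labels = [0, 0, 1, 1]  # groups neighbours together
bad_labels = [0, 1, 0, 1]   # splits neighbours apart

print(wss(points, good_labels), wss(points, bad_labels))  # 1.0 100.0
print(round(mean_silhouette(points, good_labels), 2))     # 0.9
print(round(mean_silhouette(points, bad_labels), 2))      # -0.45
```

A sensible partition gives a low WSS and a silhouette near +1, while a poor one gives a high WSS and a negative silhouette. In practice both values would be recomputed for each candidate k and plotted.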
E.2. Clustering by Race/Ethnicity: Unweighted and Weighted by Population Size. Data was obtained from the U.S. Census Bureau on the racial composition of each of the 3108 counties (or equivalents) in the contiguous United States (Hawaii and Alaska were excluded due to lack of county reporting)13, along with BLS data on the unemployment rate in April 2019 and April 2020, before and during the pandemic14. The racial composition data was used to find the percent of each county’s population that is non-Hispanic White and the percent that is minority, calculated by summing the Black, Asian, Hispanic, and Native American percentages for that county. The unemployment data of each U.S. county was used to calculate the increase in unemployment rate by taking the difference between its 2020 and 2019 rates. A K-means clustering algorithm, with a hyperparameter of 3 (the output from the silhouette and elbow methods), was then used to identify clusters in the data. Every county was weighted equally by the K-means algorithm (populations were not taken into account). By analysing the centroids of the clusters, which represent the average increase in unemployment rate and the average racial composition for that cluster, conclusions could be drawn about the correlation between the racial composition of a county (i.e. percent non-Hispanic White and percent minority) and its increase in unemployment rate, making it possible to determine whether counties with a larger percent of a certain race/ethnicity were affected more in terms of unemployment by the economic crisis caused by the pandemic. Figure 5, Figure 6, and Figure 7 depict the clustering algorithm and its results.
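The three values clustered for each county can be derived as in the following sketch. The field names and the example record are hypothetical; the actual column names in the Census Bureau and BLS files are not given in this paper.

```python
MINORITY_FIELDS = ("black", "asian", "hispanic", "native_american")


def county_features(record):
    """Return (percent non-Hispanic White, percent minority, point increase)."""
    percent_minority = sum(record[field] for field in MINORITY_FIELDS)
    point_increase = record["rate_2020"] - record["rate_2019"]
    return (record["white_non_hispanic"], percent_minority, point_increase)


# Hypothetical county record, for illustration only.
example = {
    "white_non_hispanic": 60.0,
    "black": 15.0,
    "asian": 5.0,
    "hispanic": 18.0,
    "native_american": 2.0,
    "rate_2019": 3.5,   # April 2019 unemployment rate, percent
    "rate_2020": 12.0,  # April 2020 unemployment rate, percent
}

print(county_features(example))  # (60.0, 40.0, 8.5)
```

Applying this to every county yields the three-dimensional points that are passed to the clustering algorithm.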
The analysis was extended by taking into account the populations of each county in the K-means algorithm. The smallest county population in each state was identified, and these values were averaged across states to give a single base population (calculated to be 12,406). A multiplier was then calculated for each county based on how many times greater its population was than the base population. Each county’s data was duplicated and added to the end of the dataset as many times as that county’s multiplier. The procedure for calculating racial composition and change in unemployment rate for each county was the same as the unweighted clustering approach detailed previously. Finally, the K-means clustering algorithm was applied with a hyperparameter of 3 to determine the centroids and whether certain groups had higher increases in unemployment rate when weighted by population size.
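Weighting by duplication can be sketched as follows. The base population of 12,406 comes from the text above; everything else is an assumption made for illustration: the function name, the example populations, rounding the multiplier to the nearest whole number, the floor of one copy per county, and the reading that a county appears in the weighted dataset as many times in total as its multiplier.

```python
BASE_POPULATION = 12_406  # base population reported in the text


def weight_by_duplication(rows, populations, base_population=BASE_POPULATION):
    """Repeat each county's row once per multiple of the base population."""
    weighted = []
    for row, population in zip(rows, populations):
        # Assumption: round to the nearest whole multiple, with a floor of 1
        # so that no county drops out of the dataset.
        multiplier = max(1, round(population / base_population))
        weighted.extend([row] * multiplier)
    return weighted


# Hypothetical rows: (percent White, percent minority, point increase).
rows = [(90.0, 10.0, 9.0), (80.0, 20.0, 8.8), (55.0, 45.0, 8.5), (30.0, 70.0, 11.5)]
populations = [5_000, 12_406, 50_000, 124_060]

weighted_rows = weight_by_duplication(rows, populations)
print(len(weighted_rows))  # 16 rows: 1 + 1 + 4 + 10
```

Running an unweighted K-means on the duplicated rows pulls each centroid toward the more populous counties, which is the effect the weighted analysis is after.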
E.3. Clustering by Geographic Region. To further investigate the unemployment rate change in the US during COVID-19, counties were clustered by geographic region as designated by the Census Bureau: West (Arizona, California, Colorado, Idaho, Montana, Nevada, New Mexico, Oregon, Utah, Washington, Wyoming); Midwest (Illinois, Indiana, Iowa, Kansas, Michigan, Minnesota, Missouri, Nebraska, North Dakota, Ohio, South Dakota, Wisconsin); Northeast (Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont); South (Alabama, Arkansas, Delaware, District of Columbia, Florida, Georgia, Kentucky, Louisiana, Maryland, Mississippi, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia). The same datasets from the census website13 and the BLS website were used as in the clustering by race/ethnicity. However, data was queried for only the states in the geographic region in question. The hyperparameter of 3 (the optimal number of clusters as outputted by the elbow and silhouette methods) was inputted into the K-means algorithm, and the resulting clusters were analysed.
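Restricting the data to one region amounts to a lookup from state to region followed by a filter. The sketch below follows the Census Bureau's regional definitions for the contiguous United States, under which Kansas belongs to the Midwest and Kentucky to the South; the record layout and the county names are hypothetical.

```python
CENSUS_REGIONS = {
    "West": ["Arizona", "California", "Colorado", "Idaho", "Montana", "Nevada",
             "New Mexico", "Oregon", "Utah", "Washington", "Wyoming"],
    "Midwest": ["Illinois", "Indiana", "Iowa", "Kansas", "Michigan", "Minnesota",
                "Missouri", "Nebraska", "North Dakota", "Ohio", "South Dakota",
                "Wisconsin"],
    "Northeast": ["Connecticut", "Maine", "Massachusetts", "New Hampshire",
                  "New Jersey", "New York", "Pennsylvania", "Rhode Island",
                  "Vermont"],
    "South": ["Alabama", "Arkansas", "Delaware", "District of Columbia",
              "Florida", "Georgia", "Kentucky", "Louisiana", "Maryland",
              "Mississippi", "North Carolina", "Oklahoma", "South Carolina",
              "Tennessee", "Texas", "Virginia", "West Virginia"],
}

# Invert the mapping so each state points at its region.
STATE_TO_REGION = {state: region
                   for region, states in CENSUS_REGIONS.items()
                   for state in states}


def counties_in_region(counties, region):
    """Keep only the county records whose state lies in the given region."""
    return [c for c in counties if STATE_TO_REGION.get(c["state"]) == region]


# Hypothetical county records, for illustration only.
counties = [
    {"name": "County A", "state": "California"},
    {"name": "County B", "state": "Kansas"},
    {"name": "County C", "state": "Kentucky"},
]

print([c["name"] for c in counties_in_region(counties, "Midwest")])  # ['County B']
```

Each regional subset can then be clustered separately with the same hyperparameter.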
E.4. Statistical Distinction. Finally, in order to validate the hyperparameter and ensure that the clusters generated by the K-means algorithm were statistically distinct, a significance test based on confidence intervals was performed. To ensure that the clusters were statistically distinct in all of their characteristics, the test was applied to each of the three dimensions of the clustered data. The Python statistics library was utilised to calculate the standard deviation, which was then divided by the square root of the sample size and multiplied by two (since a 95% confidence interval spans approximately two standard errors on either side of the mean); the result was then added to and subtracted from the cluster’s mean. If the confidence intervals of two clusters do not overlap, it can be concluded with approximately 95% confidence that the cluster means are statistically distinct, corresponding to a P value of less than 0.05. The formula to find the confidence interval is shown below, in which z is the critical value for the chosen confidence level (approximately 2 for 95%), n is the sample/cluster size, x̄ is the sample/cluster mean, and x is each county’s data point. This must be repeated three times for every dimension as there are three clusters.
(Equation 3)
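The interval calculation described in this section can be sketched with the Python statistics library. The helper names and the two sets of values are hypothetical, and the use of the sample standard deviation (`statistics.stdev`, with an n - 1 denominator) rather than the population version (`statistics.pstdev`) is an assumption, since the text does not say which was used.

```python
import math
import statistics


def confidence_interval(values, z=2.0):
    """Mean plus or minus z standard errors (z = 2 approximates 95%)."""
    mean = statistics.mean(values)
    # Assumption: sample standard deviation (n - 1 denominator).
    margin = z * statistics.stdev(values) / math.sqrt(len(values))
    return (mean - margin, mean + margin)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical point increases for the counties in two clusters.
cluster_a = [9.0, 9.2, 9.4, 9.1, 9.3]
cluster_b = [8.0, 8.2, 8.1, 8.3, 7.9]

ci_a = confidence_interval(cluster_a)
ci_b = confidence_interval(cluster_b)
print(ci_a, ci_b)
print(intervals_overlap(ci_a, ci_b))  # False: the two means are distinct
```

The same check would be repeated for each pair of clusters in each of the three dimensions. Non-overlapping intervals are a conservative criterion: intervals that overlap slightly can still correspond to a significant difference under a formal two-sample test.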
Results
A. Increase in Unemployment Rate by Race and Sex. Between June 2019 and June 2020, Figure 1 shows that women experienced greater increases in unemployment rate than men. Similarly, minority groups experienced greater increases in unemployment rate than the White population. Women had higher increases in unemployment rate regardless of race/ethnicity. The greatest disparity between the two sexes’ increase in unemployment rate from 2019 to 2020 occurred within the Hispanic population (+2.9 points), followed by the Asian population (+2.8 points), the White population (+2.4 points), and lastly, the Black population (+0.9 points). Among all demographics, Hispanic women had the greatest increase in unemployment (14.4 points) while White men had the lowest increase in unemployment (7.8 points).
Figure 1. Quadruple bar graph and its corresponding data table comparing unemployment rate from June 2019 to June 2020 between men and women of different races/ethnicities.
B. Increase in Unemployment Rate by Race and Occupation. Figure 2 depicts that between June 2019 and June 2020, service occupations had the highest increase in unemployment rate across all races/ethnicities, and each occupation seemed to favor a certain racial group. All occupations experienced a distinct increase in unemployment rate for all races/ethnicities except the Hispanic and Black populations in management and professional occupations, indicating that management occupations were least affected by the pandemic. Conversely, Asian populations saw the highest percent increase in unemployment rate in management and professional occupations (19.48%) as well as in production and transportation occupations (14.22%). Despite their success in management occupations, the Hispanic population saw the highest percent increase in unemployment rate in service occupations (54.05%) as well as construction and maintenance occupations (34.59%). The Black population led in unemployment rate in sales and office occupations (35.08%). It is important to note that the White population did not lead in the percent increase in unemployment rate for any of the occupations considered.
Figure 2. A bar graph and its corresponding data table comparing the change in unemployment rate from June 2019 to June 2020 for the five BLS occupations by race/ethnicity.
C. Increase in Unemployment Rate by Sex and Occupation. As depicted in Figure 3, from July 2019 to July 2020 women led in unemployment rate increases in all occupations except for those related to maintenance (including construction). Service occupations experienced the greatest increase in unemployment rate overall for both men (11.1 points) and women (12.2 points).
Figure 3. A bar graph and its corresponding data table depicting the unemployment rate change from June 2019 to June 2020 for the men and women across the five BLS occupations.
D. Multivariable Linear Regression by Race/Ethnicity. As shown in Figure 4, each category’s coefficient is significant, since all the P values are less than 0.05. It is interesting to note that while the Black and Hispanic populations’ coefficients had a 0.7 difference, there was a difference of 1.9 between the coefficients of the Black and Asian populations, and a difference of 2.2 between those of the Asian and White populations.
Figure 4. The summary of the results from the linear regression on race/ethnicity by unemployment rate.
E. K-means Clustering.
E.1. Clustering by Race/Ethnicity: Unweighted and Weighted by Population. From the clustering in Figure 5, a correlation between the percent White and the point increase in unemployment rate was determined. The centroid with the highest White percentage (89.6%) had the highest increase in unemployment rate (9.19 point increase). As White percentage fell so did the point increase. The centroid with the second highest White percentage (65.1%) saw a point increase in unemployment rate of 8.30, and the centroid with the smallest White percentage (35.12%) saw the lowest point increase in unemployment rate of 8.09.
Figure 5. 3D scatter plot and its corresponding data table depicting racial composition and point increase in unemployment rate of each U.S. county. Points of the same color are clustered under the same centroid.
From the clustering in Figure 6, a correlation between the percent minority and the point increase in unemployment rate, weighted by population size, was determined. The centroid with the highest minority percentage (68%) had the highest increase in unemployment rate (11.6 point increase). The centroid with the second highest minority percentage (38.7%) had the smallest point increase of 10.3. The centroid with the smallest minority percentage (14.1%) had a point increase of 10.7.
Figure 6. 3D scatter plot and its corresponding data table depicting racial composition and point increase in unemployment rate of each U.S. county and weighted by population size. Points with a darker outline are weighted more (easily visible in the green cluster). Points of the same color are clustered under the same centroid.
E.2. Clustering by Geographic Region: West. From the clustering in Figure 7, it was determined that the centroid with a predominantly minority population had the highest increase in unemployment rate (8.5 point increase), while the centroids with a majority white population experienced a roughly 8 point increase in unemployment rate, about 0.5 points lower than the minority centroid.
Figure 7. 3D scatter plot and its corresponding data table depicting racial composition and point increase in unemployment rate of each U.S. county in the Western region. Points of the same color are clustered under the same centroid.
E.3. Clustering by Geographic Region: Midwest. From the clustering in Figure 8, it was determined that the centroids with a majority white population (cluster numbers 1 and 3) experienced a lower point increase in unemployment rate than the majority-minority centroid (cluster number 2). Upon analysis of the scatter plot, it was revealed that most counties in the Midwest are at least 60% white.
Figure 8. 3D scatter plot and its corresponding data table depicting racial composition and point increase in unemployment rate of each U.S. county in the Midwestern region.
E.4. Clustering by Geographic Region: Northeast. From the clustering in Figure 9, the research shows there is no clear correlation between racial composition of the centroids and point increase in unemployment rate. While the centroid with a 67% white population (cluster number 3) did have the highest increase in unemployment rate at +12 points, the centroid with a 90% white population (cluster number 1) actually had a lower increase in unemployment. The majority-minority centroid (cluster number 2) was numerically in between the two white centroids in terms of unemployment increase. The research also revealed that as a whole, the Northeast region experienced the highest increase in unemployment (+11 to 12 points) compared to any other region in the US, which suggests that this region was the hardest hit.
Figure 9. 3D scatter plot and its corresponding data table depicting racial composition and point increase in unemployment rate of each U.S. county in the Northeastern region.
E.5. Clustering by Geographic Region: South. From the clustering in Figure 10, it was shown that there is no clear correlation between racial composition of the centroids and point increase in unemployment rate. While the centroid with an 87% white population (cluster number 1) did have the highest increase in unemployment rate at +9.75 points, the centroid with a majority-minority population (cluster number 2) actually had a lower increase in unemployment.
Figure 10. 3D scatter plot and its corresponding data table depicting racial composition and point increase in unemployment rate of each U.S. county in the Southern region.
Discussion
A. Increase in Unemployment Rate by Race/Ethnicity and Sex. Hispanic women were affected the most compared to other groups in terms of point increase in unemployment rate. It is also clear that the Black population began with the highest unemployment rate, but was surpassed by the Hispanic population, showing the extent to which the Hispanic population was affected. In addition, for every race/ethnicity, the research showed that women in general were affected more with regards to unemployment compared to men. There are several possible explanations for this disparity. Sexism in the workplace may have contributed to discrimination against women during this time. Employers may have laid off women at higher rates than men purely based on existing prejudices. Sex-plus age discrimination may have led employers to lay off older women at higher rates, bypassing the law against discrimination in the workplace15. Another possible explanation is that women are more likely to change their schedules and career goals to accommodate family life than men16. Therefore, as caregiving services have closed down and children are staying home, women may be forced to accommodate them. This in turn may have a negative impact on their performance at work, ultimately resulting in the loss of their job. These results are significant as they demonstrate how existing inequalities between the sexes are exacerbated by the pandemic.
B. Increase in Unemployment Rate by Race/Ethnicity and Occupation. Service occupations experienced the highest increase in unemployment across the board. According to the BLS, service occupations include jobs in the food and beverage, hotel, aviation, and other industries17. These industries were the hardest hit by the pandemic; for example, nearly 60% of restaurant closures during COVID-19 are permanent18 and airline companies’ business dropped to an all-time low19, which explains the high unemployment in service occupations in the results. Conversely, management and professional occupations experienced the lowest overall percent increase in unemployment, with Hispanics, the group that was otherwise the most affected, even seeing an increase in employment rate. According to the BLS, employees in management and professional occupations had the most opportunities for teleworking (working from home), with over 54% of employees in the occupation working remotely during the pandemic. Service occupations, on the other hand, had among the fewest opportunities for teleworking, with only about 7% of employees working remotely during the pandemic20. Because businesses in the service industry could not keep physical locations open (as they were not classified as essential) and could not offer a teleworking option to their employees, those working in these occupations were more likely to be laid off. This, however, is not the case in management jobs due to their high teleworking potential. Teleworking potential may play a major role in determining whether an occupation will experience sharp increases in unemployment rate.
C. Increase in Unemployment Rate by Sex and Occupation. As stated previously, service occupations likely had the highest increase in unemployment rate since many service jobs are non-essential, without teleworking potential. The fact that maintenance occupations were the only occupations in which men had a greater increase in unemployment rate may be attributed to the fact that 94.6% of maintenance workers are male21. Therefore, a larger number of men may have been fired simply because they make up an overwhelming majority of that industry.
D. Multivariable Linear Regression by Race/Ethnicity. Month was the only category that had a negative coefficient, due to the fact that this data was taken starting from April, where unemployment rate spiked, to June. Because of this, the highest unemployment rate was often in April, and as each month passed, the unemployment rate decreased by around 1.3.
Through the findings, the research revealed that the Hispanic ethnicity had the highest increase in unemployment rate during the COVID-19 pandemic as it had the largest coefficient. This corroborates the previous findings about the racial disparity with Hispanics being affected the most with regards to unemployment rate both by gender and occupation. The Black population had the next highest increase in unemployment rate. The gaps between the coefficients of different minority groups show that although the Asian population is also a minority, they were not affected as strongly as the other two minority populations (also consistent with the previous findings). However, the Asian population still greatly contrasts with the White population in terms of unemployment rate. This may be due to the difference in occupations that each of the races/ethnicities predominantly work in. The Asian population makes up 14.1% of service occupations, the hardest hit occupation by the pandemic, while other minorities, Black and Hispanic, made up about 20-22% of the occupation22. Hispanics also made up the largest percent of the sales occupation, the second hardest hit occupation. As the other races/ethnicities made up more of these non-essential jobs, they were more likely to be unemployed. The White population, on the other hand, made up only 13.4% of service occupations22, and made up more of the management and professional occupation, which was not hit as hard22, allowing many of the White population to remain employed. Similarly, Asians made up 44.6% of the management, professional, and related occupations, and therefore were not unemployed as much as the other races/ethnicities.
E. K-means Clustering.
E.1. Clustering by Race/Ethnicities: Unweighted and Weighted by Population. The results from the unweighted clustering seem to contradict the previous analysis of national data, which indicated that minorities had higher unemployment rates due to COVID-19. One possible explanation is that minority populations are concentrated within specific counties, whereas rural areas, which tend to have predominantly white populations, make up 97% of the U.S.23. Therefore, in the unweighted clustering, a majority of data points showed unemployment rates more reflective of the white population, with little to no weight given to large minority populations within a single county. For instance, a county with a population of 15,000 that is 60% minority and has a low unemployment rate counts as much as any other county, which can make it seem as though high-minority counties have lower unemployment rates. However, this is not reflective of the overall population: larger counties with large minority populations are less common but have high unemployment rates, which indicates that minority populations generally have higher unemployment rates. Weighting by population accounts for these differences and provides a more accurate representation of the disparities that exist. However, more analysis must be done to determine whether this is definitely the reason.
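The effect of weighting can be illustrated with a single cluster centroid. In K-means, a centroid is the mean of the points assigned to the cluster; a population-weighted variant replaces this with a weighted mean. The sketch below is not the study's code: the county values are hypothetical, and the weighting scheme (each county weighted by its population) is an assumption about how such a weighting could be done.

```python
def centroid(points, weights=None):
    """Return the (optionally weighted) mean of a list of equal-length points."""
    if weights is None:
        weights = [1.0] * len(points)  # unweighted: every county counts equally
    total_weight = sum(weights)
    dimensions = len(points[0])
    return tuple(
        sum(w * p[d] for p, w in zip(points, weights)) / total_weight
        for d in range(dimensions)
    )


# Hypothetical counties in one high-minority cluster; NOT the paper's data.
# Each tuple: (population, minority share, point increase in unemployment rate).
counties = [
    (15_000, 0.60, 2.0),
    (20_000, 0.65, 2.5),
    (10_000, 0.70, 1.5),
    (2_000_000, 0.62, 9.0),
]
features = [(share, increase) for _, share, increase in counties]
populations = [population for population, _, _ in counties]

unweighted = centroid(features)
weighted = centroid(features, weights=populations)

print(f"unweighted centroid increase: {unweighted[1]:.2f}")
print(f"population-weighted centroid increase: {weighted[1]:.2f}")
```

With these hypothetical values, the three small counties pull the unweighted centroid towards a low increase, while the weighted centroid is dominated by the single large county, where most of the cluster's residents live.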
E.2. Clustering by Geographic Region. When U.S. counties were clustered by the geographic region they are in, the research revealed that the West was the only region in which majority-minority counties experienced a higher point increase in unemployment rate than majority-white counties. This could be attributed to the fact that the West has a sizable minority population relative to the rest of the country. As of 2017, there were only six majority-minority states in the US: California, Nevada, New Mexico, Hawaii, Texas, and the District of Columbia24. Of these six, four are in the Western region. The effect of COVID-19 on Hispanic and Black unemployment rates specifically was especially profound due to these groups’ high workforce representation in service, construction, and production occupations relative to their white counterparts. These occupations, as figure 2 depicts, were the hardest hit.
Moving beyond the West into the Midwestern region, the results are more consistent with the nationwide county-level clustering: majority-white counties saw a higher increase in unemployment rate than majority-minority counties. The Midwest is known as the “Land of Farms,” which underscores the extent to which the region’s economy is dominated by agriculture. The Midwest also has a sizable manufacturing industry25. With the closure of many domestic and foreign markets for agricultural exports and a severe labor shortage, the relatively concentrated Midwestern economy has been hit hard. Soybeans, an over $140 billion industry, were the top agricultural product for most Midwestern states26, and the closure of meat processing plants during the pandemic has caused soy farmers to lose their number one domestic customer27. In addition to agriculture, the situation plaguing the majority-white Midwestern manufacturing industry may explain the high unemployment rate. Many Midwestern manufacturing plants had to close their doors due to employee safety concerns and compliance with new state regulations28.
It is important to note that while the Northeast showed little discernible correlation between a county’s racial/ethnic composition and its unemployment rate, the region as a whole saw a higher increase in unemployment rate during the pandemic than any other part of the US.
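The regional comparison described in this section can be sketched as a grouping of counties by region and by majority status, followed by a comparison of mean point increases. This is a simplified illustration rather than the study's method: it uses a fixed 50% minority-share threshold in place of K-means cluster assignments, the county records are hypothetical, and `mean_increase_by_group` is an illustrative helper name.

```python
from collections import defaultdict

# Hypothetical county records; NOT the paper's data.
# Each tuple: (region, minority share, pre-pandemic rate, pandemic rate),
# with both rates in percent.
county_records = [
    ("West", 0.70, 4.0, 14.0),
    ("West", 0.30, 3.5, 11.5),
    ("Midwest", 0.65, 4.5, 10.5),
    ("Midwest", 0.20, 3.0, 12.0),
    ("Northeast", 0.55, 4.0, 15.0),
    ("Northeast", 0.25, 3.5, 14.5),
]


def mean_increase_by_group(records, threshold=0.5):
    """Average the point increase in unemployment rate per (region, label)."""
    increases = defaultdict(list)
    for region, minority_share, rate_before, rate_during in records:
        # Assumption: a fixed share threshold stands in for cluster membership.
        if minority_share > threshold:
            label = "majority-minority"
        else:
            label = "majority-white"
        increases[(region, label)].append(rate_during - rate_before)
    return {key: sum(values) / len(values) for key, values in increases.items()}


summary = mean_increase_by_group(county_records)
for (region, label), increase in sorted(summary.items()):
    print(f"{region:<10} {label:<18} {increase:+.1f} points")
```

The point increase is the pandemic rate minus the pre-pandemic rate, so the comparison reflects the change in unemployment rather than its level. The hypothetical values were chosen to mirror the qualitative pattern reported above and carry no evidential weight of their own.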
Conclusion
COVID-19 has posed great challenges to the American people in terms of their livelihoods and daily lives, and this study reveals a significant demographic disparity. Overall, the research on the national population indicates that although the Black population had a higher unemployment rate prior to the pandemic, the Hispanic population experienced the greatest increase in unemployment rate during the pandemic. In county-level data, however, predominantly White counties showed a greater increase in unemployment rate, possibly due to the underrepresentation of minorities. Considering this possibility, these findings have profound implications for relief policies. When allocating funds, the government may want to consider both county and minority representation, and it can work towards providing aid to the affected populations and their families to help them through this health crisis. Furthermore, the findings on a regional level may serve as a guideline for federal and state governments to effectively allocate stimulus checks and relief funding to the hardest hit industries and geographic regions.
Acknowledgements
We would like to thank ASDRP for providing us with the opportunity to conduct research remotely. Additionally, we owe our advisor Dr. Phil Mui immense gratitude for his continued support throughout our scientific career as well as the time he has dedicated in teaching and assisting us throughout our endeavors. When we encountered issues, Dr. Mui took time out of his day in order to advise us on how to proceed, and his enthusiasm for our research project has never faltered.
References
- “Effects of the Coronavirus COVID-19 Pandemic (CPS).” 2020. Bls.Gov. September 23, 2020. https://www.bls.gov/cps/effects-of-the-coronavirus-covid-19-pandemic.htm.
- “Can People in the U.S. Get COVID-19? How Does COVID-19 Spread? What Are the Symptoms of COVID-19? What Are Severe Complications from This Virus? How Can I Help Protect Myself?” 2020. https://www.cdc.gov/coronavirus/2019-ncov/downloads/2019-ncov-factsheet.pdf.
- Kochhar, Rakesh. “Unemployment Rose Higher in Three Months of COVID-19 than It Did in Two Years of the Great Recession.” Pew Research Center. Pew Research Center, June 11, 2020. https://www.pewresearch.org/fact-tank/2020/06/11/unemployment-rose-higher-in-three-months-of-covid-19-than-it-did-in-two-years-of-the-great-recession/.
- Ajilore, Olugbenga. “On the Persistence of the Black-White Unemployment Gap.” Center for American Progress. Accessed October 25, 2020. https://www.americanprogress.org/issues/economy/reports/2020/02/24/480743/persistence-black-white-unemployment-gap/.
- “Robust Unemployment Insurance, Other Relief Needed to Mitigate Racial and Ethnic Unemployment Disparities.” Center on Budget and Policy Priorities, August 6, 2020. https://www.cbpp.org/research/economy/robust-unemployment-insurance-other-relief-needed-to-mitigate-racial-and-ethnic.
- Mark Hugo Lopez, Lee Rainie, and Abby Budiman. “Financial and Health Impacts of COVID-19 Vary Widely by Race and Ethnicity.” Pew Research Center. Pew Research Center, May 5, 2020. https://www.pewresearch.org/fact-tank/2020/05/05/financial-and-health-impacts-of-covid-19-vary-widely-by-race-and-ethnicity/.
- Ruprecht, Megan M, Xinzi Wang, Amy K Johnson, Jiayi Xu, Dylan Felt, Siobhan Ihenacho, Patrick Stonehouse, et al. “Evidence of Social and Structural COVID-19 Disparities by Sexual Orientation, Gender Identity, and Race/Ethnicity in an Urban Environment.” Journal of Urban Health, December 1, 2020. https://doi.org/10.1007/s11524-020-00497-9.
- “Concepts and Definitions (CPS).” Bls.gov, February 3, 2020. https://www.bls.gov/cps/definitions.htm#unemployed.
- “One-Screen Data Search.” 2018. Bls.Gov. 2018. https://data.bls.gov/PDQWeb/la.
- “HOUSEHOLD DATA NOT SEASONALLY ADJUSTED A-20. Employed Persons by Occupation, Race, Hispanic or Latino Ethnicity, and Sex [Percent Distribution].” n.d. Accessed October 24, 2020. https://www.bls.gov/web/empsit/cpseea20.pdf.
- “A-30. Unemployed Persons by Occupation and Sex.” 2020. Bls.Gov. October 2, 2020. https://www.bls.gov/web/empsit/cpseea30.htm.
- “Employed Persons by Occupation, Race, Hispanic or Latino Ethnicity, and Sex.” 2020. Bls.Gov. January 22, 2020. https://www.bls.gov/cps/cpsaat10.htm.
- US Census Bureau. 2020. “2019 Population Estimates by Age, Sex, Race and Hispanic Origin.” The United States Census Bureau. June 25, 2020. https://www.census.gov/newsroom/press-kits/2020/population-estimates-detailed.html.
- “Civilian Unemployment Rate.” 2010. Bls.Gov. 2010. https://www.bls.gov/charts/employment-situation/civilian-unemployment-rate.htm.
- Bachman, Eric. 2020. “Older Female Employees Face Double Jeopardy During Covid-19 Layoffs.” Forbes, June 1, 2020. https://www.forbes.com/sites/ericbachman/2020/06/01/older-female-employees-face-double-jeopardy-during-covid-19-layoffs/#27ef11ba59c8.
- Parker, Kim. “Women More than Men Adjust Their Careers for Family Life.” Pew Research Center. Pew Research Center, August 14, 2020. https://www.pewresearch.org/fact-tank/2015/10/01/women-more-than-men-adjust-their-careers-for-family-life/.
- “Occupational Definitions – Service Occupations.” 2002. Bls.Gov. June 27, 2002. https://stats.bls.gov/oes/1998/oes_def6.htm.
- Croft, Jay. 2020. “Yikes! Yelp Says 60% of Restaurant Covid-19 Closures Are Permanent.” CNN. July 25, 2020. https://www.cnn.com/2020/07/25/business/restaurants-reopen-coronavirus-shutdown-trnd/index.html.
- Slotnick, David. 2020. “Airline Industry Will Not Recover from Coronavirus for Years: Moody’s – Business Insider.” Business Insider. Business Insider. July 19, 2020. https://www.businessinsider.com/airline-aviation-industry-recovery-coronavirus-forecast-moodys-2020-7.
- Terrell, Kenneth. 2020. “10 Occupations Hit Hardest by Coronavirus Pandemic.” AARP. July 2020. https://www.aarp.org/work/job-search/info-2020/coronavirus-occupation-job-loss.html.
- “Natural Resources, Construction, & Maintenance Occupations | Data USA.” 2016. Datausa.Io. 2016. https://datausa.io/profile/soc/natural-resources-construction-maintenance-occupations.
- “Sales Representatives, Services, All Other.” Data USA. https://datausa.io/profile/soc/sales-representatives-services-all-other.
- “Rurality in the United States.” ruralhome.org, 2020. http://www.ruralhome.org/storage/research_notes/Rural_Research_Note_Rurality_web.pdf.
- “A State-by-State Look at Growing Minority Populations.” Governing. https://www.governing.com/topics/urban/gov-majority-minority-populations-in-states.html.
- “Midwest Economy – Federal Reserve Bank of Chicago.” 2020. Chicagofed.Org. 2020. https://www.chicagofed.org/region/midwest-economy/midwest-economy.
- “Biggest Agricultural Exports in Every State.” 2019. Stacker. 2019. https://stacker.com/stories/3731/biggest-agricultural-exports-every-state?page=3.
- “The Ripple Effect of COVID-19 for Soybean Farmers, Demand, and Transportation.” 2020. Successful Farming. April 30, 2020. https://www.agriculture.com/news/the-ripple-effect-of-covid-19-for-soybean-farmers-demand-and-transportation.
- Ciechalski, Suzanne, Lisa Riordan Seville, and Emily Siegel. 2020. “Midwest Manufacturing Workers Sound Alarm over COVID-19 Outbreaks.” NBC News. NBC News. May 16, 2020. https://www.nbcnews.com/news/us-news/midwest-manufacturing-workers-sound-alarm-over-covid-19-outbreaks-n1207391.
About the Authors
Aditya is a junior at Washington High School in Fremont, California. An avid outdoorsman, he enjoys camping and hiking in his free time and hopes to one day scale Mt. Whitney. He also enjoys playing badminton and watching football with his dad. Aditya wishes to go into finance when he grows up and to start his own global initiative to provide water and supplies to impoverished regions of the globe. He enjoys coding when it is applied to real-world problems, particularly those of social justice, where it may ultimately help our society become more fair and equitable.
Maithili is a junior at American High in Fremont, California. She hopes to go into the field of computer science or engineering. In her free time, she enjoys playing badminton with her family and friends as well as learning about different fields of engineering. Whether through volunteer work or scientific endeavors, she hopes to use her scientific knowledge and skills to contribute to and better the community.
Justin Lin is a junior at Henry M. Gunn High School in Palo Alto, California. When not attending school, Justin loves to spend his time learning about computer science and participating in competitive programming competitions. Additionally, Justin has devoted several years to wrestling, competing for both club and school teams. He hopes to pursue a career dedicated to innovating transformative technology and helping those who have disabilities.
Angelina Loh is a junior at American High School in Fremont, California. She is an aspiring computer science major and particularly loves coding in Python. Her hobbies include digital art, traditional painting, and playing Nintendo games. In the future, she hopes to further advance her coding skills and explore the fields of artificial intelligence and machine learning.