Employment
Because the datasets available to us (through the public-facing Statistics Canada web portal and the Odesi data portal, accessed via the Ontario Tech University library service) have limited geographic granularity, we only have access to the monthly Labour Force Survey data at the level of major metropolitan areas. The analysis below therefore uses the Toronto census metropolitan area (CMA).
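import pandas as pd
import matplotlib.pyplot as pl  # assumption: `pl` is the pyplot alias; the original import cell is not shown

# `labourforce` is the project's data-loading module; its import is not shown in this section
lfs_toronto = labourforce.dataframe(cma="Toronto")

def monthly_employment_by_industry(industry):
    """Count respondents per month and labour force status, optionally for one industry."""
    if industry is None:
        df = lfs_toronto
    else:
        # case-insensitive substring match; na=False treats missing industries as non-matches
        index = lfs_toronto['naics_21'].str.lower().str.contains(industry, na=False)
        df = lfs_toronto[index]
    df = df[['date', 'lfsstat']] \
        .groupby(['date', 'lfsstat']) \
        .size() \
        .reset_index(name='count') \
        .set_index(['date'])
    # one column per labour force status, one row per month
    df = df.pivot(columns='lfsstat')
    df.columns = [x[1] for x in df.columns.values]
    return df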
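As a rough illustration of the filter-count-pivot step, the sketch below runs the same idea on a handful of made-up records. The `records` frame and its values are hypothetical and only reuse the column names `date`, `naics_21` and `lfsstat` from the notebook. It also swaps `pivot` for `unstack(fill_value=0)`, so a month with no respondents in a status shows 0 rather than NaN.

```python
import pandas as pd

# Hypothetical survey records (not real data): one row per respondent per month.
records = pd.DataFrame({
    "date": ["2020-01-01"] * 4 + ["2020-02-01"] * 4,
    "naics_21": ["Retail trade", "Retail trade", "Construction", None,
                 "Retail trade", "Retail trade", "Construction", "Construction"],
    "lfsstat": ["Employed", "Unemployed", "Employed", "Employed",
                "Employed", "Employed", "Unemployed", "Employed"],
})

# Case-insensitive substring match; na=False treats missing industries as non-matches.
mask = records["naics_21"].str.lower().str.contains("retail", na=False)
retail = records[mask]

# Count respondents per (month, status), then spread the statuses into columns.
counts = (
    retail.groupby(["date", "lfsstat"])
    .size()
    .unstack("lfsstat", fill_value=0)
)
print(counts)
```

Without `na=False`, a missing industry would produce a missing value in the mask, which pandas cannot use for boolean indexing.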
#
# analysis of unemployment numbers
#
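def unemployment_rolling_mean(df):
    # percentage of 'Unemployed' among all labour force statuses present in df
    df = df['Unemployed'] / df.sum(axis=1) * 100
    # five-month rolling mean; months without a full window are dropped
    df = df.rolling(5).mean().dropna()
    return df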
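To make the smoothing step concrete, here is a minimal sketch of the same rate and rolling-mean calculation on invented monthly counts. The numbers and the two status labels are assumptions chosen for easy arithmetic, not survey results.

```python
import pandas as pd

# Hypothetical monthly counts by labour force status (not real survey data).
counts = pd.DataFrame(
    {"Employed": [90, 80, 85, 90, 95, 96], "Unemployed": [10, 20, 15, 10, 5, 4]},
    index=pd.date_range("2020-01-01", periods=6, freq="MS"),
)

# Share of "Unemployed" among all statuses present in each month, in percent.
rate = counts["Unemployed"] / counts.sum(axis=1) * 100

# Five-month rolling mean; the first four months have no full window and are dropped.
smoothed = rate.rolling(5).mean().dropna()
print(smoothed)
```

Six months of input leave two smoothed values (May and June 2020). Note that the denominator is the sum of every status column, so if the pivoted frame also contained a non-labour-force status, the result would be a share of all respondents rather than the conventional unemployment rate.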
#
# plot the pre and post
#
def plot1(df, industry):
    # split the series at 2020-03 and 2021-03; the shared boundary months
    # keep the three line segments connected in the plot
    pre = df[df.index <= '2020-03-01']
    covid = df[(df.index >= '2020-03-01') & (df.index <= '2021-03-01')]
    post = df[df.index >= '2021-03-01']
    pl.figure(figsize=(10, 6))
    ax = pl.gca()
    # dashed reference line at the 2019-10 (pre-COVID) level
    pd.Series(pre['2019-10-01'], index=pre.index).plot.line(ax=ax, style='--')
    pre.plot.line(ax=ax, color='green')
    covid.plot.line(ax=ax, color='red')
    # dashed reference line at the most recent level
    pd.Series(post.iloc[-1], index=post.index).plot.line(ax=ax, style='--')
    post.plot.line(ax=ax, color='blue')
    pl.ylabel('%')
    pl.title('Unemployment (%s)' % industry)
Unemployment Numbers
df = monthly_employment_by_industry(None)
df = unemployment_rolling_mean(df)
plot1(df, 'All')
df = monthly_employment_by_industry('retail')
df = unemployment_rolling_mean(df)
plot1(df, 'Retail')
The following lists the industries recorded in the Labour Force Survey data for Toronto, ordered from most to least frequent.
#
# top industries
#
# industries ordered by number of records; [1:] skips the first (most frequent) entry
top_industries = lfs_toronto['naics_21'].value_counts().index[1:].values.tolist()
pd.DataFrame(top_industries, columns=['Top Industries'])
Index | Top Industries |
---|---|
0 | Professional, scientific and technical services |
1 | Retail trade |
2 | Health care and social assistance |
3 | Finance and insurance |
4 | Educational services |
5 | Construction |
6 | Transportation and warehousing |
7 | Accommodation and food services |
8 | Information, culture and recreation |
9 | Manufacturing - non-durable goods |
10 | Manufacturing - durable goods |
11 | Business, building and other support services |
12 | Wholesale trade |
13 | Other services (except public administration) |
14 | Public administration |
15 | Real estate and rental and leasing |
16 | Utilities |
17 | Agriculture |
18 | Mining, quarrying, and oil and gas extraction |
19 | Forestry and logging and support activities fo... |
20 | Fishing, hunting and trapping |
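The table below compares pre- and post-COVID-lockdown unemployment rates by industry, using the five-month rolling mean at October 2019 (pre) and October 2021 (post). Recovery is pre minus post, so a negative value means unemployment is still higher than before the lockdown. Rows are sorted from the most negatively impacted industries to the best-recovered ones.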
#
# recovery_by_industry
#
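# respondents per (month, industry, labour force status)
df = lfs_toronto.groupby(['date', 'naics_21', 'lfsstat']).size() \
    .reset_index(name='count') \
    .set_index(['date', 'naics_21'])
# one column per status; drop (month, industry) rows where any status is missing
df = df.pivot(columns='lfsstat').dropna()
df.columns = [x[1] for x in df.columns.values]
# percentage of 'Unemployed' among all statuses present
df = df['Unemployed'] / df.sum(axis=1) * 100
# one column per industry, smoothed with a five-month rolling mean
df = df.reset_index(level=1).pivot(columns='naics_21').rolling(5).mean().iloc[4:]
df.columns = [x[1] for x in df.columns.values]
# compare October 2019 (pre) with October 2021 (post)
df = df.loc[['2019-10-01', '2021-10-01']].transpose()
df.columns = ['pre', 'post']
# positive recovery: unemployment is lower than before COVID
df['recovery'] = df['pre'] - df['post']
df.sort_values(by='recovery', inplace=True)
df.dropna(inplace=True)
recovery = df['recovery']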
df
Industry | pre | post | recovery |
---|---|---|---|
Information, culture and recreation | 3.153645 | 6.204308 | -3.050663 |
Manufacturing - non-durable goods | 2.267684 | 4.645276 | -2.377592 |
Other services (except public administration) | 3.168398 | 4.912130 | -1.743732 |
Business, building and other support services | 4.731357 | 5.928790 | -1.197433 |
Construction | 2.574257 | 3.509540 | -0.935282 |
Health care and social assistance | 1.911522 | 2.488857 | -0.577335 |
Manufacturing - durable goods | 2.531799 | 3.022837 | -0.491038 |
Retail trade | 3.678571 | 4.086203 | -0.407632 |
Transportation and warehousing | 2.493183 | 2.845258 | -0.352075 |
Accommodation and food services | 5.606982 | 5.761452 | -0.154470 |
Educational services | 5.594402 | 5.537084 | 0.057317 |
Professional, scientific and technical services | 2.244297 | 1.962104 | 0.282193 |
Finance and insurance | 2.566738 | 1.639034 | 0.927704 |
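The recovery column can be checked by hand. The sketch below copies three rows from the table above into a small frame (the frame name `rates` is only illustrative) and repeats the subtraction and sort.

```python
import pandas as pd

# Three rows copied from the table above (percent, five-month rolling mean).
rates = pd.DataFrame(
    {"pre": [3.153645, 3.678571, 2.566738], "post": [6.204308, 4.086203, 1.639034]},
    index=["Information, culture and recreation", "Retail trade", "Finance and insurance"],
)

# Positive recovery: unemployment is lower than before; negative: still higher.
rates["recovery"] = rates["pre"] - rates["post"]
rates = rates.sort_values(by="recovery")
print(rates.round(6))
```

Sorting in ascending order puts the industry with the largest remaining increase in unemployment first and the best-recovered industry last, matching the ordering of the table.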
# red bars: unemployment still above its pre-COVID level; blue bars: at or below it
color = pd.Series('', index=recovery.index)
color[recovery < 0] = 'red'
color[recovery >= 0] = 'blue'
pl.figure(figsize=(10, 5))
recovery.plot.bar(color=color);