New update of the recent second wave (or third?) of COVID-19 in Indonesia

Last updated on Jul 28, 2021 6 min read coronavirus COVID-19, python

It has been a while since I last wrote about COVID-19. Today I'd like to check out Indonesia's COVID-19 statistics, especially since things unfortunately seem to be getting worse these days, and many people have raised the possibility that the government is intentionally undertesting to push down new case counts at the cost of human lives.

Daily new Covid-19 cases are trending down. This is happening as the number of tests drops significantly. It is still too early to conclude that the Covid-19 wave is under control. #Humaniora #AdadiKompas @aik_arif https://t.co/eYvloMmpIC
— Harian Kompas (@hariankompas) July 21, 2021

I rely heavily on Our World in Data ¹, which gives free access to COVID-19 data.

Grab the data and show the top 6 rows

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
url='https://covid.ourworldindata.org/data/owid-covid-data.csv' # store the URL
df=pd.read_csv(url, parse_dates=['date']) # download from the URL; parse_dates turns the date column into datetimes
df.head(6) # show the first 6 rows

	iso_code	continent	location	date	total_cases	new_cases	new_cases_smoothed	total_deaths	new_deaths	new_deaths_smoothed	...	extreme_poverty	cardiovasc_death_rate	diabetes_prevalence	female_smokers	male_smokers	handwashing_facilities	hospital_beds_per_thousand	life_expectancy	human_development_index	excess_mortality
0	AFG	Asia	Afghanistan	2020-02-24	1.0	1.0	NaN	NaN	NaN	NaN	...	NaN	597.029	9.59	NaN	NaN	37.746	0.5	64.83	0.511	NaN
1	AFG	Asia	Afghanistan	2020-02-25	1.0	0.0	NaN	NaN	NaN	NaN	...	NaN	597.029	9.59	NaN	NaN	37.746	0.5	64.83	0.511	NaN
2	AFG	Asia	Afghanistan	2020-02-26	1.0	0.0	NaN	NaN	NaN	NaN	...	NaN	597.029	9.59	NaN	NaN	37.746	0.5	64.83	0.511	NaN
3	AFG	Asia	Afghanistan	2020-02-27	1.0	0.0	NaN	NaN	NaN	NaN	...	NaN	597.029	9.59	NaN	NaN	37.746	0.5	64.83	0.511	NaN
4	AFG	Asia	Afghanistan	2020-02-28	1.0	0.0	NaN	NaN	NaN	NaN	...	NaN	597.029	9.59	NaN	NaN	37.746	0.5	64.83	0.511	NaN
5	AFG	Asia	Afghanistan	2020-02-29	1.0	0.0	0.143	NaN	NaN	0.0	...	NaN	597.029	9.59	NaN	NaN	37.746	0.5	64.83	0.511	NaN

6 rows × 60 columns

I am not super familiar with its variables, so let's check them out with df.columns.

df.columns # list the variable (column) names

Index(['iso_code', 'continent', 'location', 'date', 'total_cases', 'new_cases',
       'new_cases_smoothed', 'total_deaths', 'new_deaths',
       'new_deaths_smoothed', 'total_cases_per_million',
       'new_cases_per_million', 'new_cases_smoothed_per_million',
       'total_deaths_per_million', 'new_deaths_per_million',
       'new_deaths_smoothed_per_million', 'reproduction_rate', 'icu_patients',
       'icu_patients_per_million', 'hosp_patients',
       'hosp_patients_per_million', 'weekly_icu_admissions',
       'weekly_icu_admissions_per_million', 'weekly_hosp_admissions',
       'weekly_hosp_admissions_per_million', 'new_tests', 'total_tests',
       'total_tests_per_thousand', 'new_tests_per_thousand',
       'new_tests_smoothed', 'new_tests_smoothed_per_thousand',
       'positive_rate', 'tests_per_case', 'tests_units', 'total_vaccinations',
       'people_vaccinated', 'people_fully_vaccinated', 'new_vaccinations',
       'new_vaccinations_smoothed', 'total_vaccinations_per_hundred',
       'people_vaccinated_per_hundred', 'people_fully_vaccinated_per_hundred',
       'new_vaccinations_smoothed_per_million', 'stringency_index',
       'population', 'population_density', 'median_age', 'aged_65_older',
       'aged_70_older', 'gdp_per_capita', 'extreme_poverty',
       'cardiovasc_death_rate', 'diabetes_prevalence', 'female_smokers',
       'male_smokers', 'handwashing_facilities', 'hospital_beds_per_thousand',
       'life_expectancy', 'human_development_index', 'excess_mortality'],
      dtype='object')

There's a huge chunk of variable names! Must have been super hard work collecting all the data. Shout out to Hannah Ritchie et al.

Aight, now let's check new cases! New cases tend to be volatile, especially if there's seasonality in the data itself. It is quite common to see seasonality in daily data just because of weekends. Thankfully, there's new_cases_smoothed, which I imagine takes the seasonality into account by using a 7-day rolling average. I only take Indonesian data for this post.

indo=df[["iso_code","date","new_cases","new_cases_smoothed"]].query('iso_code == "IDN"').copy() # .copy() so a column can be added later without a SettingWithCopyWarning

Plot time!

sns.lineplot(data=indo,x='date',y='new_cases')
sns.lineplot(data=indo,x='date',y='new_cases_smoothed')
plt.xticks(rotation=45)

[Plot: new_cases and new_cases_smoothed for Indonesia, by date]

I tried to make my own 7-day rolling average by copying code from here

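# 7-day rolling mean; shift(-3) moves each value 3 days earlier so the window is centered on its date
indo['cases_7day_ave'] = indo.new_cases.rolling(7).mean().shift(-3)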
indo.head(10)

	iso_code	date	new_cases	new_cases_smoothed	cases_7day_ave
44074	IDN	2020-03-02	2.0	NaN	NaN
44075	IDN	2020-03-03	0.0	NaN	NaN
44076	IDN	2020-03-04	0.0	NaN	NaN
44077	IDN	2020-03-05	0.0	NaN	0.857143
44078	IDN	2020-03-06	2.0	NaN	2.428571
44079	IDN	2020-03-07	0.0	0.571	3.571429
44080	IDN	2020-03-08	2.0	0.857	4.571429
44081	IDN	2020-03-09	13.0	2.429	4.571429
44082	IDN	2020-03-10	8.0	3.571	9.285714
44083	IDN	2020-03-11	7.0	4.571	13.142857

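The numbers match new_cases_smoothed, only three days earlier: shift(-3) centers my window on each date, while new_cases_smoothed averages the current day and the six days before it. This confirms that new_cases_smoothed is indeed a 7-day rolling average.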
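For a quick check of that comparison without downloading anything, here is a small sketch that uses only the ten new_cases values printed in the table above. The variable names trailing and centered are mine, not from the dataset.

```python
import pandas as pd

# First ten daily new_cases values for Indonesia, copied from the table above.
new_cases = pd.Series(
    [2.0, 0.0, 0.0, 0.0, 2.0, 0.0, 2.0, 13.0, 8.0, 7.0],
    index=pd.date_range("2020-03-02", periods=10, freq="D"),
)

# Trailing window: the value on a date averages that date and the six before it.
trailing = new_cases.rolling(7).mean()

# Centered window: the same numbers, moved three days earlier.
centered = trailing.shift(-3)

# Both lines print the same four averages, just attached to different dates.
print(trailing.round(3).loc["2020-03-08":"2020-03-11"].tolist())
print(centered.round(3).loc["2020-03-05":"2020-03-08"].tolist())
```

The first line lines up with the new_cases_smoothed column for 8 to 11 March, the second with cases_7day_ave for 5 to 8 March.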

sns.lineplot(data=indo,x='date',y='new_cases')
sns.lineplot(data=indo,x='date',y='new_cases_smoothed')
sns.lineplot(data=indo,x='date',y='cases_7day_ave')
plt.xticks(rotation=45)
plt.legend(['new cases','new cases smoothed','my own 7-day average'])
plt.ylabel('cases')
plt.xlabel('date')

[Plot: new_cases, new_cases_smoothed and cases_7day_ave for Indonesia, by date]

A year and a half is a bit too long (dear god, it's already been a year and a half??), so let's cut it down to just 2021.

indo2=indo.query('date > "2021-01-01"') # keep only dates after 1 January 2021
# then plot exactly as above
sns.lineplot(data=indo2,x='date',y='new_cases')
sns.lineplot(data=indo2,x='date',y='new_cases_smoothed')
plt.xticks(rotation=45)
plt.legend(['new cases','new cases smoothed']) # only two lines are drawn here
plt.ylabel('cases')
plt.xlabel('date')

[Plot: new_cases and new_cases_smoothed for Indonesia, 2021 only]

Cases do indeed seem to be going down, even in the smoothed series. But is this because of undertesting? We can check that from our dataset too. We add the positive rate to really make sure.

indo=df[["iso_code","date","new_tests","new_tests_smoothed",
        "new_cases","new_cases_smoothed","positive_rate"]].query('iso_code == "IDN"')
indo2=indo.query('date > "2021-01-01"')

fig, axes = plt.subplots(1, 2, figsize=(18, 10))
fig.suptitle('New tests and positive rate, Indonesia')
# left panel: daily and smoothed new tests
sns.lineplot(ax=axes[0],data=indo2,x='date',y='new_tests')
sns.lineplot(ax=axes[0],data=indo2,x='date',y='new_tests_smoothed')
axes[0].tick_params(axis='x', labelrotation=45)
axes[0].legend(['new tests','new tests smoothed'])
axes[0].set_ylabel('new tests')
axes[0].set_xlabel('date')
axes[0].set_title('new tests')
# right panel: positive rate, set through axes[1] instead of the implicit current axes
sns.lineplot(ax=axes[1],data=indo2,x='date',y='positive_rate')
axes[1].tick_params(axis='x', labelrotation=45)
axes[1].set_ylabel('positive rate (0-1)')
axes[1].set_xlabel('date')
axes[1].set_title('positive rate')

[Plot: left panel new_tests and new_tests_smoothed, right panel positive_rate; Indonesia, 2021 only]
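Why look at the positive rate at all? A rough sketch with made-up numbers, assuming the positive rate is roughly positive cases divided by tests performed. None of the figures below are Indonesian data, and positive_rate here is my own helper, not the dataset column.

```python
# Hypothetical numbers, chosen only for illustration (not Indonesian data).
def positive_rate(cases: float, tests: float) -> float:
    """Share of tests that come back positive."""
    return cases / tests

before = positive_rate(cases=40_000, tests=160_000)

# Scenario A: tests are halved and cases fall in proportion.
# The positive rate does not move, so the drop in cases tells us little.
fewer_tests_only = positive_rate(cases=20_000, tests=80_000)

# Scenario B: tests are halved and cases fall by more than half.
# The positive rate falls too, which hints at a real decline,
# provided the way people are selected for testing has not changed.
real_decline = positive_rate(cases=12_000, tests=80_000)

print(before, fewer_tests_only, real_decline)  # 0.25 0.25 0.15
```

Scenario B only counts as evidence if the selection of who gets tested stays the same, which is exactly the caveat that applies to the Indonesian data.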

And yes, testing is indeed going down. At the same time, the positive rate seems to be trending down as well. How to read this depends on how testing is conducted, in terms of who gets tested and who doesn't. We could be more sure if we checked hospitalisations and deaths. Unfortunately, Indonesian hospitalisation numbers are non-existent in this dataset.

df.query('iso_code=="IDN"')[['weekly_icu_admissions','weekly_hosp_admissions']]

	weekly_icu_admissions	weekly_hosp_admissions
44074	NaN	NaN
44075	NaN	NaN
44076	NaN	NaN
44077	NaN	NaN
44078	NaN	NaN
...	...	...
44582	NaN	NaN
44583	NaN	NaN
44584	NaN	NaN
44585	NaN	NaN
44586	NaN	NaN

513 rows × 2 columns
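Eyeballing the head and tail of 513 rows is not quite proof that every value is missing. A minimal sketch of a stricter check, shown on a toy frame with made-up values so it runs on its own; on the real data the same two calls would go on df.query('iso_code=="IDN"').

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Indonesian rows; values are made up.
toy = pd.DataFrame({
    "new_deaths": [10.0, np.nan, 12.0],
    "weekly_icu_admissions": [np.nan, np.nan, np.nan],
    "weekly_hosp_admissions": [np.nan, np.nan, np.nan],
})

# True for a column only when every single row is missing.
all_missing = toy.isna().all()
print(all_missing)

# Number of non-missing values per column (0 means no data at all).
print(toy.count())
```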

On deaths (dear God, bless all the lost souls and those they left behind), the situation is rather gloomy.

indo=df[["iso_code","date","new_deaths","new_deaths_smoothed"]].query('iso_code == "IDN"')
indo2=indo.query('date > "2021-01-01"')
sns.lineplot(data=indo2,x='date',y='new_deaths')
sns.lineplot(data=indo2,x='date',y='new_deaths_smoothed')
plt.xticks(rotation=45)
plt.legend(['new deaths','new deaths, 7-day moving average'])
plt.ylabel('deaths')
plt.xlabel('date')

[Plot: new_deaths and new_deaths_smoothed for Indonesia, 2021 only]

Judging from the death data, the pandemic is still far from over. Note that deaths may follow new cases, and hence lag behind them in trending down. However, if we cannot trust the test data, the death data is also hard to trust. I think with unreliable data it is hard to react to any news really, whether cases go up or down. It is hard to make a good case for the government, because people are like: low cases: bad data! bad testing! High cases: the government is stupid!
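On the lag point: one rough way to look at it is to shift the case series by different numbers of days and see which shift correlates best with deaths. The sketch below runs on synthetic series, where the 14-day delay and the 2% ratio are made up, and best_lag is a hypothetical helper of mine. On real data the answer would be much noisier and should not be over-read.

```python
import numpy as np
import pandas as pd

# Synthetic data: one wave of cases, and deaths that copy it 14 days later.
# The 14-day delay and the 2% ratio are made up for illustration.
days = np.arange(150)
cases = pd.Series(1000 * np.exp(-((days - 60) / 15) ** 2))
deaths = 0.02 * cases.shift(14)

def best_lag(cases: pd.Series, deaths: pd.Series, max_lag: int = 28) -> int:
    """Return the shift (in days) of cases that correlates best with deaths."""
    corr_by_lag = {
        lag: cases.shift(lag).corr(deaths)  # Series.corr skips missing pairs
        for lag in range(max_lag + 1)
    }
    return max(corr_by_lag, key=corr_by_lag.get)

print(best_lag(cases, deaths))  # 14, the delay built into the synthetic data
```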

So yeah. I guess it helps if we don't overreact to the new case numbers, because they might not reveal the true state of Indonesia's COVID-19 pandemic.

What about vaccination? Judging from all of our graphs up there, new cases and the positive rate shot up around June. What happened during that month? The arrival of Delta? What kind of crowded events happened during that time? What high-mobility events took place around then? The government might have let high-mobility events take place given that the vaccination program had started. So let me end this post by showing vaccination speed across countries, including Indonesia.
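A comparable number can be pulled out of the same dataset. The sketch below assumes new_vaccinations_smoothed_per_million (one of the columns listed earlier) is a reasonable proxy for vaccination speed. latest_vaccination_speed is a hypothetical helper, and the toy rows and country names are made up so the example runs without a download.

```python
import pandas as pd

SPEED_COL = "new_vaccinations_smoothed_per_million"

def latest_vaccination_speed(df: pd.DataFrame, locations: list) -> pd.Series:
    """Most recent non-missing daily doses per million for each location."""
    subset = df[df["location"].isin(locations)].dropna(subset=[SPEED_COL])
    # Sort by date so that .last() picks the most recent report per location.
    return (
        subset.sort_values("date")
        .groupby("location")[SPEED_COL]
        .last()
        .sort_values(ascending=False)
    )

# Toy rows with made-up values, only to show the shape of the result.
toy = pd.DataFrame({
    "location": ["Country A", "Country A", "Country B", "Country B"],
    "date": pd.to_datetime(["2021-07-01", "2021-07-20", "2021-07-01", "2021-07-20"]),
    SPEED_COL: [2000.0, 3500.0, 6000.0, None],
})
result = latest_vaccination_speed(toy, ["Country A", "Country B"])
print(result)
```

With the real frame, the call would look like latest_vaccination_speed(df, ["Indonesia", ...]), with whichever comparison countries you care about.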

¹ Hannah Ritchie, Esteban Ortiz-Ospina, Diana Beltekian, Edouard Mathieu, Joe Hasell, Bobbie Macdonald, Charlie Giattino, Cameron Appel, Lucas Rodés-Guirao and Max Roser (2020) - “Coronavirus Pandemic (COVID-19)”. Published online at OurWorldInData.org. Retrieved from: https://ourworldindata.org/coronavirus [Online Resource]


Krisna Gupta

Lecturer
