Package: scrapers.phw_covid_statement

Submodules:

get_data_url

Uses Selenium to open a Safari window and retrieve the URL of the most recent dataset.

phw_scraper

Module containing the functions required for obtaining and cleaning data on COVID cases in each Local Authority from the Public Health Wales dashboard.

backend.scrapers.phw_covid_statement.phw_scraper.area_code(laName)

Returns the code corresponding to the given Local Authority name.
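
A name-to-code lookup of this kind might look like the sketch below. It is illustrative only: the table name LA_CODES, the two sample entries (assumed here to be ONS-style codes), the whitespace stripping, and the ValueError on unknown names are all assumptions, not the module's documented behaviour.

```python
# Hypothetical lookup table; the real module may store its mapping elsewhere
# and will cover every Welsh Local Authority, not just these two samples.
LA_CODES = {
    "Cardiff": "W06000015",
    "Swansea": "W06000011",
}


def area_code(laName):
    """Return the code for a Local Authority name (illustrative sketch)."""
    try:
        # Strip stray whitespace, which is common in spreadsheet cells.
        return LA_CODES[laName.strip()]
    except KeyError:
        # Assumed behaviour: fail loudly rather than return None.
        raise ValueError(f"Unknown Local Authority: {laName!r}") from None
```

Raising on an unknown name, rather than returning None, makes a renamed or misspelled authority visible at cleaning time instead of silently producing rows without a code.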

backend.scrapers.phw_covid_statement.phw_scraper.clean_data(input_path, output_path)

Reads the raw data from the input path, selects the sheet containing the COVID cases data, extracts the most recent data, and matches it to the correct Local Authority codes. The cleaned data is written to the output path as a .csv.

Parameters
  • input_path (str) – File path to read the raw data from

  • output_path (str) – File path to write the cleaned data to

Returns

The most recent date for which data was collected, as a string, and a list of the column names written to the .csv.

Return type

str, list
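
The core of the cleaning step (keep the most recent date, attach codes, write a .csv, return the date and column names) can be sketched with the standard library alone. This is not the module's implementation: the helper name write_latest_cases, the row keys "date", "local_authority" and "cases", and the output column names are hypothetical, and the rows are assumed to have been read from the Excel sheet already.

```python
import csv


def write_latest_cases(rows, codes, output_path):
    """Keep rows from the most recent date, attach codes, and write a CSV.

    rows  : list of dicts with hypothetical keys "date" (ISO string),
            "local_authority" and "cases"
    codes : dict mapping Local Authority name to its code
    """
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    latest = max(row["date"] for row in rows)
    columns = ["area_code", "local_authority", "date", "cases"]

    with open(output_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns)
        writer.writeheader()
        for row in rows:
            name = row["local_authority"]
            # Skip older dates and any area without a known code.
            if row["date"] == latest and name in codes:
                writer.writerow({
                    "area_code": codes[name],
                    "local_authority": name,
                    "date": row["date"],
                    "cases": row["cases"],
                })

    # Mirrors the documented return shape: date string, then column names.
    return latest, columns
```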

backend.scrapers.phw_covid_statement.phw_scraper.get_phw_data(output_path)

Downloads the PHW dashboard data as an Excel file and saves it as .xlsx at the given output path.

Runs the function get_data_link.

Parameters

output_path (str) – File path to raw data output location
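
Once the dataset URL is known, the download itself can be a few lines. The sketch below uses only the Python standard library and a hypothetical helper name, download_file; the actual function may rely on a different HTTP client, and the URL would come from get_data_link rather than being passed in.

```python
import shutil
import urllib.request


def download_file(url, output_path):
    """Stream the resource at url to output_path and return the path."""
    # Copy in chunks so a large workbook is never held fully in memory.
    with urllib.request.urlopen(url) as response, open(output_path, "wb") as out:
        shutil.copyfileobj(response, out)
    return output_path
```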

backend.scrapers.phw_covid_statement.phw_scraper.run_phw_scraper(raw_folder, cleaned_folder)

Gets the latest data from PHW.

This function gets the latest data and writes it to the raw folder as an .xlsx file. It then cleans the data and writes the cleaned data to the cleaned folder.

Parameters
  • raw_folder (str) – File path to write the raw scraped data to.

  • cleaned_folder (str) – File path to write the cleaned data to.
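
The two-stage flow described above (download to the raw folder, then clean into the cleaned folder) can be sketched as follows. The helper name run_pipeline, the output file names, and the folder creation are assumptions; fetch and clean are passed in as stand-ins for get_phw_data and clean_data so that the sketch stays self-contained.

```python
import os


def run_pipeline(raw_folder, cleaned_folder, fetch, clean):
    """Fetch raw data into raw_folder, then clean it into cleaned_folder.

    fetch(output_path) and clean(input_path, output_path) are stand-ins
    for get_phw_data and clean_data respectively.
    """
    # Assumed convenience: create the folders if they do not exist yet.
    os.makedirs(raw_folder, exist_ok=True)
    os.makedirs(cleaned_folder, exist_ok=True)

    # Hypothetical file names; the real scraper may name files differently.
    raw_path = os.path.join(raw_folder, "phw_raw.xlsx")
    cleaned_path = os.path.join(cleaned_folder, "phw_cleaned.csv")

    fetch(raw_path)
    # Pass through whatever the cleaning step returns.
    return clean(raw_path, cleaned_path)
```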