Package: backend

The backend module manages all data loading, cleaning and processing tasks associated with generating the json object for the frontend visualisation.

Submodules

generate_json

This module defines the content and structure of the final data.json file and writes it out for the frontend, which uses it to plot the data.

Notes

Running this module as __main__ will generate the .json file and write it to the data folder in the frontend.

If you are adding a new variable, you must first define it as a Variable instance and then add the name of the variable instance to either the LA_VARBS or LSOA_VARBS list, depending on whether it is an LA or LSOA variable. Before doing this, you must also have added the data source to the live or static modules in the datasets package, so that it appears in the corresponding MasterDataset object.

The pd.Series provided to the Variable class instances are columns from the instances of MasterDataset that are imported from the datasets package. These are:

LA_STATIC_MASTER (from datasets.static)
LSOA_STATIC_MASTER (from datasets.static)
LA_LIVE_MASTER (from datasets.live)
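The two steps for adding a variable can be sketched as below. This is a minimal illustration, not the package's code: the Variable dataclass here is a stand-in that only mirrors the documented constructor fields, and the series, index names, label and area codes are all hypothetical placeholders for a real column of LA_STATIC_MASTER.

```python
from dataclasses import dataclass
from typing import Optional

import pandas as pd


@dataclass
class Variable:
    """Stand-in with the documented constructor fields; not the real class."""

    data: pd.Series
    label: str
    data_class: str
    invert: bool
    data_type: str
    la_and_lsoa: bool = True
    data_transformed_: Optional[pd.Series] = None


# Hypothetical stand-in for one column of LA_STATIC_MASTER, indexed by
# area name and area code as the Variable documentation asks.
index = pd.MultiIndex.from_tuples(
    [("Area A", "X001"), ("Area B", "X002")],
    names=["area_name", "area_code"],
)
volunteers_count = pd.Series([120, 45], index=index, name="volunteers_count")

# Step 1: define the new variable as a Variable instance.
volunteers = Variable(
    data=volunteers_count,
    label="Number of volunteers",
    data_class="support",
    invert=False,
    data_type="count",
)

# Step 2: register it in the list for its resolution (LA in this example).
LA_VARBS = [volunteers]
```

In the real module the series would come from one of the MasterDataset instances listed above rather than being built by hand.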

class backend.generate_json.DataDashboard(la_data: backend.generate_json.Variables, lsoa_data: backend.generate_json.Variables)

Bases: object

Transforms existing Variables objects into one object that can be written to .json.

la_data

A Variables object of all the LA Variables to be included.

Type

Variables

lsoa_data

A Variables object of all the LSOA Variables to be included.

Type

Variables

la_data: Variables = None
lsoa_data: Variables = None
to_json()

Creates the final dict object to write to json.

Notes

The metadata written to json here is only the LA metadata. This is because we assume that any LA level data is also available at LSOA level, and so the LA metadata will cover all the variables available.

Returns

Dictionary with three keys: variables, LAs, LSOAs. The values are lists of dictionaries containing the data as defined in Variables.

Return type

dict
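As a rough picture of the returned dictionary, the sketch below assembles the three documented keys from hand-written lists. The helper name build_dashboard_dict, the row values and the area codes are hypothetical; the metadata keys follow the description of meta_to_json, and the boolean value for lsoa is an assumption.

```python
import json

# Metadata entries use the keys documented for Variable.meta_to_json.
# Assumption: "lsoa" carries a boolean flag.
la_metadata = [
    {"name": "volunteers", "label": "Number of volunteers",
     "class": "support", "lsoa": True},
]

# Hypothetical rows, one dict per geographic area.
la_rows = [{"area_name": "Area A", "area_code": "X001", "volunteers": 1.2}]
lsoa_rows = [{"area_name": "Area A 001", "area_code": "X001A", "volunteers": 0.8}]


def build_dashboard_dict(metadata, la_rows, lsoa_rows):
    """Combine LA metadata and both data levels under the three documented keys."""
    return {"variables": metadata, "LAs": la_rows, "LSOAs": lsoa_rows}


dashboard = build_dashboard_dict(la_metadata, la_rows, lsoa_rows)
serialised = json.dumps(dashboard)  # plain dicts and lists serialise directly
```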

write()

Writes out the variables in the required json format to the frontend.

Notes

The frontend data folder is assumed to be: frontend/map/data/data.json
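A minimal sketch of writing to that location is shown below. The function name write_dashboard and the project_root parameter are hypothetical; the example writes under a temporary directory so that it does not overwrite a real data.json.

```python
import json
import tempfile
from pathlib import Path


def write_dashboard(payload: dict, project_root: Path) -> Path:
    """Write payload to frontend/map/data/data.json under project_root."""
    out_path = project_root / "frontend" / "map" / "data" / "data.json"
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as f:
        json.dump(payload, f)
    return out_path


payload = {"variables": [], "LAs": [], "LSOAs": []}

# Use a throwaway root here; the real module targets the repository's frontend folder.
with tempfile.TemporaryDirectory() as tmp:
    written = write_dashboard(payload, Path(tmp))
    relative_path = written.relative_to(tmp).as_posix()
    round_trip = json.loads(written.read_text(encoding="utf-8"))
```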

class backend.generate_json.Variable(data: pandas.core.series.Series, label: str, data_class: str, invert: bool, data_type: str, la_and_lsoa: bool = True, data_transformed_: pandas.core.series.Series = None)

Bases: object

This class defines the metadata and transformations needed for a variable. It will generate the transformed variable, and will also generate an object with the variable’s associated metadata.

Notes

Not all transformations can be applied to all data types. For instance, data of type rank cannot be transformed to a percentage. In these cases, the variable data is returned as itself.

data

The variable data. Index should be set as area name and area code.

Type

pd.Series

label

Human readable label to be presented on the map.

Type

str

data_class

Accepts either ‘support’ or ‘challenge’.

Type

str

invert

Does the direction of the data need to be inverted before mapping?

Type

bool

data_type

Is the data a percentage, a count or a rank?

Type

str

la_and_lsoa

Is it available at both LA and LSOA resolution? By default, True.

Type

bool

data_transformed_

Set by calling the transform method. By default, None.

Type

pd.Series

data: pd.Series = None
data_class: str = None
data_transformed_: pandas.core.series.Series = None
data_type: str = None
invert: bool = None
invert_data()

Invert data direction AFTER transformation and set self.data_transformed_ as the output.

Notes

Data of type percentage, count and per100k are all inverted by subtracting them from 100. This is only valid if inversion is applied after transform_per100, which expresses count and per100k data per 100. Data of type rank has its rank order reversed.

Raises

Exception – When a data type is defined that is not yet supported.
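The inversion rule described in the notes can be sketched as follows. This is an illustration under assumptions, not the method's source: the function name invert is hypothetical, the rank reversal via max + min - value is one possible way to reverse the order, and the sketch raises ValueError (a subclass of Exception) for unsupported types.

```python
import pandas as pd


def invert(transformed: pd.Series, data_type: str) -> pd.Series:
    """Invert already-transformed data according to its data type."""
    if data_type in ("percentage", "count", "per100k"):
        # Valid only because the values are already expressed per 100.
        return 100 - transformed
    if data_type == "rank":
        # Assumption: reverse the order so the lowest rank becomes the highest.
        return transformed.max() + transformed.min() - transformed
    raise ValueError(f"Unsupported data type: {data_type}")


inverted_pct = invert(pd.Series([10.0, 75.0]), "percentage")
inverted_rank = invert(pd.Series([1, 2, 3]), "rank")
```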

la_and_lsoa: bool = True
label: str = None
meta_to_json()

Creates a dict of the metadata, containing name, label, class, and lsoa. This is used in the variables section of the .json file.

Returns

Dictionary with keys name, label, class and lsoa.

Return type

dict

new_name()

Assuming all variables are originally named name_datatype, this method removes the _datatype suffix and returns just name as a str.
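A small sketch of this renaming, assuming the data type is whatever follows the last underscore. The function name strip_datatype_suffix and the example names are hypothetical.

```python
def strip_datatype_suffix(name: str) -> str:
    """Drop the trailing _datatype part of a name_datatype string."""
    base, sep, _suffix = name.rpartition("_")
    # If there is no underscore at all, return the name unchanged.
    return base if sep else name


short_name = strip_datatype_suffix("volunteers_count")
multi_part = strip_datatype_suffix("pop_density_rank")
no_suffix = strip_datatype_suffix("volunteers")
```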

property res

Guesses and sets the resolution of the data from the number of rows.

transform()

Applies transformation methods to the variable and sets the data_transformed_ attribute.

Returns

Returns self

Return type

Variable

transform_per100()

Based on the variable type, transforms the data to a percentage where possible and sets self.data_transformed_ as the output.

Notes

For count type data, the percentage is calculated relative to the population variable at that geography. Data of type per100k is simply divided by 1000. All other data types (percentage, density, rank) are returned as given.

Raises

Exception – When a data type is defined that is not yet supported.
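The per-type rules in the notes can be sketched as below. The function name to_per100 and the explicit population argument are assumptions made for the illustration; how the real method looks up the population variable is not shown in this documentation. The sketch raises ValueError, a subclass of Exception, for unsupported types.

```python
import pandas as pd


def to_per100(data: pd.Series, data_type: str, population: pd.Series) -> pd.Series:
    """Express data per 100 where the data type allows it."""
    if data_type == "count":
        # Percentage of the population at the same geography (aligned on index).
        return data * 100 / population
    if data_type == "per100k":
        # Per 100,000 becomes per 100.
        return data / 1000
    if data_type in ("percentage", "density", "rank"):
        # No meaningful percentage transform; return as given.
        return data
    raise ValueError(f"Unsupported data type: {data_type}")


areas = ["X001", "X002"]  # hypothetical area codes
population = pd.Series([1000, 400], index=areas)
count_pct = to_per100(pd.Series([50, 20], index=areas), "count", population)
rate_pct = to_per100(pd.Series([2500.0, 500.0], index=areas), "per100k", population)
rank_same = to_per100(pd.Series([1, 2], index=areas), "rank", population)
```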

property transformed_data

Returns transformed data. Will return None if transform method has not been applied.

class backend.generate_json.Variables(variables: Sequence[backend.generate_json.Variable])

Bases: object

This dataclass turns a list of variables into one overall dictionary object of all the variable data attached to each geographic area.

variables

A sequence of the variables of the same geographic resolution to be transformed into a level in the json file.

Type

Sequence[Variable]

data_to_json()

Transforms the variables, merges them to one df, rounds them to 3dp, then generates a list of dicts that represent each row (i.e. geographic area).

Returns

List of dicts, where the keys in each dict are variable names and the values are the values of each variable. This includes the area name and code as keys.

Return type

list
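The merge, round and row-to-dict steps can be sketched with pandas as follows. This is an illustration under assumptions rather than the method itself: the function name series_to_records, the index names and the sample values are hypothetical, and the transformation step is omitted.

```python
import pandas as pd

# Hypothetical transformed series sharing an (area name, area code) index.
index = pd.MultiIndex.from_tuples(
    [("Area A", "X001"), ("Area B", "X002")],
    names=["area_name", "area_code"],
)
volunteers = pd.Series([0.12345, 4.0], index=index, name="volunteers")
deprivation = pd.Series([12.3456, 7.5], index=index, name="deprivation")


def series_to_records(series_list):
    """Merge series column-wise, round to 3dp, and emit one dict per area."""
    merged = pd.concat(series_list, axis=1).round(3)
    # reset_index turns the area name and code into ordinary keys of each row.
    return merged.reset_index().to_dict(orient="records")


records = series_to_records([volunteers, deprivation])
```

Each element of records then holds the area name, the area code and one rounded value per variable.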

property is_valid

Returns True if all the variables are of the same geographic resolution.

metadata_to_json()

Returns a list of metadata dictionaries, one for each variable.

variables: Sequence[Variable] = None

run_scrapers

When run as __main__, this module runs the police coders scraper and the PHW scraper.

Notes

Running this script will execute the following actions:
  1. Get updated data using run_police_coders_scraper() and run_phw_scraper()

  2. Save scraping data archived by date (csv)

  3. Overwrite most recent data (csv)

  4. Produce groupCount layer as a count of groups per area
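Steps 2 to 4 above can be sketched with the standard library as below. The scraping step is replaced by hand-written rows, and the function names, file naming scheme, column names and date are all hypothetical; the real script's file layout is not shown in this documentation.

```python
import csv
import tempfile
from collections import Counter
from datetime import date
from pathlib import Path


def save_scrape(rows, out_dir: Path, stem: str, today: date):
    """Write a dated archive copy and overwrite the most recent copy."""
    out_dir.mkdir(parents=True, exist_ok=True)
    archive_path = out_dir / f"{stem}_{today.isoformat()}.csv"
    latest_path = out_dir / f"{stem}_latest.csv"
    for path in (archive_path, latest_path):
        with path.open("w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
    return archive_path, latest_path


def count_groups_per_area(rows, area_key: str = "area"):
    """Count how many scraped groups fall in each area."""
    return dict(Counter(row[area_key] for row in rows))


# Hypothetical scraped rows standing in for the scraper output.
rows = [
    {"area": "Area A", "group": "Group 1"},
    {"area": "Area A", "group": "Group 2"},
    {"area": "Area B", "group": "Group 3"},
]

with tempfile.TemporaryDirectory() as tmp:
    archive, latest = save_scrape(rows, Path(tmp), "police_coders", date(2020, 1, 1))
    archive_name, latest_name = archive.name, latest.name
    same_content = archive.read_text(encoding="utf-8") == latest.read_text(encoding="utf-8")

group_counts = count_groups_per_area(rows)
```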