Package: backend
The backend module manages all data loading, cleaning and processing tasks associated with generating the json object for the frontend visualisation.
Submodules
generate_json
This module is used to write and define the content and structure of the final data.json file that is used to plot the data on the frontend.
Notes
Running this module as __main__ will generate the .json file and write it to the data folder in the frontend.
If you are adding a new variable, you must first define it as a Variable instance, and then add the name of the Variable instance to either the LA_VARBS or LSOA_VARBS list, depending on whether it is an LA or LSOA variable. Before doing this, you must also have added the data source to the live or static modules in the datasets package, so that it appears in the corresponding MasterDataset object.
The pd.Series provided to the Variable class instances are columns from the instances of MasterDataset that are imported from the datasets package. These are:
LA_STATIC_MASTER (from datasets.static)
LSOA_STATIC_MASTER (from datasets.static)
LA_LIVE_MASTER (from datasets.live)
-
class backend.generate_json.DataDashboard(la_data: backend.generate_json.Variables, lsoa_data: backend.generate_json.Variables)
Bases: object
Transforms existing Variables objects into one object that can be written to .json.
-
la_data: Variables = None
-
lsoa_data: Variables = None
-
to_json()
Creates the final dict object to write to json.
Notes
The metadata written to json here is only the LA metadata. This is because we assume that any LA level data is also available at LSOA level, and so the LA metadata will cover all the variables available.
- Returns
Dictionary with three keys: variables, LAs, LSOAs. The values are lists of dictionaries containing the data as defined in Variables.
- Return type
dict
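As a rough illustration of the returned structure, the sketch below builds a dict with the three documented keys and serialises it. The variable name, the area names and codes, the "area_name" and "area_code" keys, and the helper build_dashboard are all hypothetical stand-ins, not taken from the module.

```python
import json

# Hypothetical stand-ins for the lists produced by Variables.metadata_to_json()
# and Variables.data_to_json(); every name and value here is made up.
la_metadata = [
    {"name": "pop_density", "label": "Population density", "class": "challenge", "lsoa": True}
]
la_rows = [{"area_name": "Example LA", "area_code": "X01", "pop_density": 12.345}]
lsoa_rows = [{"area_name": "Example LSOA", "area_code": "X0101", "pop_density": 10.5}]


def build_dashboard(metadata, la_data, lsoa_data):
    """Combine the three lists into a dict with the documented keys."""
    return {"variables": metadata, "LAs": la_data, "LSOAs": lsoa_data}


dashboard = build_dashboard(la_metadata, la_rows, lsoa_rows)
serialised = json.dumps(dashboard)  # text that could be written to data.json
```

Only the LA metadata is passed in here, mirroring the note above that the LA metadata is assumed to cover every variable.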
-
write()
Writes out the variables in the required json format to the frontend.
Notes
The frontend data folder is assumed to be: frontend/map/data/data.json
-
-
class backend.generate_json.Variable(data: pandas.core.series.Series, label: str, data_class: str, invert: bool, data_type: str, la_and_lsoa: bool = True, data_transformed_: pandas.core.series.Series = None)
Bases: object
This class defines the metadata and transformations needed for a variable. It will generate the transformed variable, and will also generate an object with the variable’s associated metadata.
Notes
Not all transformations can be applied to all data types. For instance, data of type rank cannot be transformed to a percentage. In these cases, the variable data is returned as itself.
-
data
The variable data. Index should be set as area name and area code.
- Type
pd.Series
-
label
Human readable label to be presented on the map.
- Type
str
-
data_class
Accepts options ‘support’ or ‘challenge’
- Type
str
-
invert
Does the direction of the data need to be inverted before mapping?
- Type
bool
-
data_type
Is the data a percentage, a count or a rank?
- Type
str
-
la_and_lsoa
Is it available at both LA and LSOA resolution? By default, True.
- Type
bool
-
data_transformed_
Set by calling the transform method. By default, None.
- Type
pd.Series
-
data: pd.Series = None
-
data_class: str = None
-
data_transformed_: pandas.core.series.Series = None
-
data_type: str = None
-
invert: bool = None
-
invert_data()
Invert data direction AFTER transformation and set self.data_transformed_ as the output.
Notes
Data of type percentage, count and per100k are all inverted by subtracting them from 100. This only holds if inversion is applied after transform_per100. Data of type rank has the rank order reversed.
- Raises
Exception – When a data type is defined that is not yet supported.
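The inversion rule in the note above can be sketched as follows. This is not the module's code: plain dicts stand in for pd.Series, the function name invert is hypothetical, and the rank branch assumes ranks run from 1 to n without ties, which the documentation does not state.

```python
def invert(values, data_type):
    """Invert the direction of already-transformed values.

    `values` is a plain dict of area -> value, standing in for a pd.Series.
    """
    if data_type in ("percentage", "count", "per100k"):
        # After transform_per100 these are all on a 0-100 scale.
        return {area: 100 - value for area, value in values.items()}
    if data_type == "rank":
        # Assumption: ranks are 1..n with no ties, so reversing the order
        # maps rank r to (n + 1 - r).
        n = len(values)
        return {area: n + 1 - rank for area, rank in values.items()}
    # ValueError is used here as one possible Exception subclass.
    raise ValueError(f"Unsupported data type: {data_type}")
```

Subtracting from 100 is only meaningful once counts and per100k values have been converted to percentages, which is why the documented order is transform first, invert second.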
-
la_and_lsoa: bool = True
-
label: str = None
-
meta_to_json()
Creates a dict of the metadata, containing name, label, class, and lsoa. This is used in the variables section of the .json file.
- Returns
Dictionary with keys name, label, class and lsoa.
- Return type
dict
-
new_name()
Assuming all variables are originally named name_datatype, this method removes the _datatype suffix and returns just name as a str.
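One way to strip the suffix, shown as a sketch only: the function name strip_datatype and the example column names are hypothetical, and it assumes the data type is everything after the last underscore.

```python
def strip_datatype(column_name):
    """Drop the final '_<datatype>' suffix from a column name."""
    name, sep, _suffix = column_name.rpartition("_")
    # With no underscore, rpartition returns ('', '', column_name),
    # so fall back to the original name unchanged.
    return name if sep else column_name
```

Splitting on the last underscore keeps names that themselves contain underscores intact, so a hypothetical "over_65_count" becomes "over_65".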
-
property res
Guess and set the resolution of the data depending on no. of rows.
-
transform()
Applies transformation methods to the variable and sets the data_transformed_ attribute.
- Returns
Returns self
- Return type
-
transform_per100()
Based on variable type, perform transforms to percentage if possible, and sets self.data_transformed_ as the output.
Notes
The percentage for count type data will be calculated as a percentage of the population variable at that geography. Data of type per100k is simply divided by 1000. All other data types (percentage, density, rank) are returned as given.
- Raises
Exception – When a data type is defined that is not yet supported.
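The rules in the note above amount to the following arithmetic. This is a sketch only: plain dicts stand in for pd.Series, and the function name to_per100 and the population argument are hypothetical.

```python
def to_per100(values, data_type, population=None):
    """Convert values to a per-100 (percentage) scale where possible.

    `values` and `population` are plain dicts of area -> number,
    standing in for pd.Series objects sharing the same index.
    """
    if data_type == "count":
        if population is None:
            raise ValueError("count data needs a population to divide by")
        return {area: 100 * value / population[area] for area, value in values.items()}
    if data_type == "per100k":
        # x per 100,000 is x / 1000 per 100.
        return {area: value / 1000 for area, value in values.items()}
    if data_type in ("percentage", "density", "rank"):
        # No per-100 conversion applies; return the data as given.
        return dict(values)
    # ValueError is used here as one possible Exception subclass.
    raise ValueError(f"Unsupported data type: {data_type}")
```

For example, a count of 50 in an area with a population of 200 becomes 25.0, and a rate of 2500 per 100k becomes 2.5.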
-
property transformed_data
Returns transformed data. Will return None if transform method has not been applied.
-
-
class backend.generate_json.Variables(variables: Sequence[backend.generate_json.Variable])
Bases: object
This dataclass turns a list of variables into one overall dictionary object of all the variable data attached to each geographic area.
-
variables
A sequence of the variables of the same geographic resolution to be transformed into a level in the json file.
- Type
Sequence[Variable]
-
data_to_json()
Transforms the variables, merges them to one df, rounds them to 3dp, then generates a list of dicts that represent each row (i.e. geographic area).
- Returns
List of dicts, where the keys in each dict are variable names and the values are the values of each variable. This includes the area name and code as keys.
- Return type
list
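The merge, round and row-wise output described above can be approximated without pandas as below. This is only a sketch: nested dicts keyed by (area name, area code) stand in for the merged DataFrame, the function name rows_from_variables and the "area_name" and "area_code" keys are assumptions, and an area missing from one variable simply lacks that key, which may differ from the module's actual merge behaviour.

```python
def rows_from_variables(variables):
    """Build one dict per geographic area from several variables.

    `variables` maps a variable name to {(area_name, area_code): value}.
    """
    rows = {}
    for var_name, series in variables.items():
        for (area_name, area_code), value in series.items():
            # Create the row for this area on first sight, then add columns.
            row = rows.setdefault(
                (area_name, area_code),
                {"area_name": area_name, "area_code": area_code},
            )
            row[var_name] = round(value, 3)  # 3 decimal places
    return list(rows.values())
```

Each returned dict corresponds to one row of the merged data, so the list can be placed directly under the LAs or LSOAs key of the final json.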
-
property is_valid
Returns True if all the variables are the same geographic resolution
-
metadata_to_json()
Returns a list of metadata dictionaries for each variable
-
variables: Sequence[Variable] = None
-
run_scrapers
When run as __main__ this module will run the police coders scraper, and the PHW scraper.
Notes
Running this script will execute the following actions:
- Get updated data using run_police_coders_scraper() and run_phw_scraper()
- Save the scraped data, archived by date (csv)
- Overwrite the most recent data (csv)
- Produce the groupCount layer as a count of groups per area
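The archive-then-overwrite pattern and the per-area count could look roughly like the sketch below. The functions save_scrape and group_count, the "archive" subfolder, the file naming scheme and the "area_code" column are all hypothetical; the module's actual paths and column names are not given here.

```python
import csv
import datetime
import pathlib
from collections import Counter


def save_scrape(rows, fieldnames, out_dir, stem, today=None):
    """Write rows to a dated archive csv and overwrite the 'latest' csv."""
    today = today or datetime.date.today()
    out_dir = pathlib.Path(out_dir)
    (out_dir / "archive").mkdir(parents=True, exist_ok=True)
    archive_path = out_dir / "archive" / f"{stem}_{today.isoformat()}.csv"
    latest_path = out_dir / f"{stem}_latest.csv"
    for path in (archive_path, latest_path):
        # Opening in "w" mode replaces any previous file at this path.
        with path.open("w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
    return archive_path, latest_path


def group_count(rows, area_key="area_code"):
    """Count how many scraped groups fall in each area."""
    return Counter(row[area_key] for row in rows)
```

Keeping a dated copy alongside the overwritten file means earlier scrapes remain available even though downstream code only reads the most recent one.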