popframe.preprocessing.population_filler

Classes

PopulationFiller(*, units, towns, ...[, ...])

A class for filling in population data for towns based on their proximity to geographic units and an adjacency matrix.

TownRow(*, geometry, index, name, level, is_city)

A class representing a town's data row with point geometry, name, administrative level, and city status.

UnitRow(*, geometry, index, population)

A class representing a unit of geographic data with a polygon or multipolygon geometry and a population count.

class popframe.preprocessing.population_filler.UnitRow(*, geometry: Polygon | MultiPolygon, index: int, population: int)

Bases: BaseRow

A class representing a unit of geographic data with a polygon or multipolygon geometry and a population count.

Attributes

geometry : shapely.Polygon | shapely.MultiPolygon

Polygonal geometry representing the unit’s boundaries.

population : int

The population residing within the geographic unit.

geometry: Polygon | MultiPolygon
population: int
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

class popframe.preprocessing.population_filler.TownRow(*, geometry: Point, index: int, name: str, level: str, is_city: bool)

Bases: BaseRow

A class representing a town’s data row with point geometry, name, administrative level, and city status.

Attributes

geometry : shapely.Point

The geographic location of the town.

name : str

The name of the town.

level : str

The administrative level of the town.

is_city : bool

Whether the town is a city.

geometry: Point
name: str
level: str
is_city: bool
model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.

class popframe.preprocessing.population_filler.PopulationFiller(*, units: GeoDataFrame[UnitRow], towns: GeoDataFrame[TownRow], adjacency_matrix: DataFrame, city_multiplier: float = 10)

Bases: BaseModel

A class for filling in population data for towns based on their proximity to geographic units and an adjacency matrix.

Attributes

units : GeoDataFrame[UnitRow]

A GeoDataFrame containing geographic units with population data.

towns : GeoDataFrame[TownRow]

A GeoDataFrame containing town data.

adjacency_matrix : pd.DataFrame

A DataFrame representing the adjacency matrix between towns, used to calculate median travel times.

city_multiplier : float, optional

A multiplier applied to towns flagged as cities when population is distributed. Defaults to 10.

Methods

validate_units(cls, gdf) -> GeoDataFrame[UnitRow]

Validates that the units input is a valid GeoDataFrame of UnitRow type.

validate_towns(cls, gdf) -> GeoDataFrame[TownRow]

Validates that the towns input is a valid GeoDataFrame of TownRow type.

validate_adjacency_matrix(cls, df) -> pd.DataFrame

Validates that the adjacency matrix is square and that its index matches its columns.

validate_model(self) -> PopulationFiller

Validates that the CRS of towns and units match and that the adjacency matrix matches the town indices.

_get_median_time(self, town_id) -> float

Computes the median time from the adjacency matrix for a given town.

fill(self) -> GeoDataFrame[TownRow]

Fills in population data for the towns based on their proximity to geographic units and the adjacency matrix.

units: GeoDataFrame[UnitRow]
towns: GeoDataFrame[TownRow]
adjacency_matrix: DataFrame
city_multiplier: float
classmethod validate_units(gdf)

Validates that the input is a GeoDataFrame of UnitRow type.

Parameters

gdf : GeoDataFrame

The input GeoDataFrame to be validated.

Returns

GeoDataFrame[UnitRow]

A validated GeoDataFrame of UnitRow type.

classmethod validate_towns(gdf)

Validates that the input is a GeoDataFrame of TownRow type.

Parameters

gdf : GeoDataFrame

The input GeoDataFrame to be validated.

Returns

GeoDataFrame[TownRow]

A validated GeoDataFrame of TownRow type.

classmethod validate_adjacency_matrix(df)

Validates the adjacency matrix to ensure it is square and that its index matches the columns.

Parameters

df : pd.DataFrame

The adjacency matrix to be validated.

Returns

pd.DataFrame

A validated adjacency matrix.

Raises

AssertionError

If the matrix index and columns do not match.
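The check described above can be illustrated with a minimal sketch. The function name `check_adjacency_matrix` and the travel-time values are hypothetical and used only for illustration; the actual validator may differ in its messages and details.

```python
import pandas as pd


def check_adjacency_matrix(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for the validator: square, index equals columns."""
    rows, cols = df.shape
    assert rows == cols, "Adjacency matrix must be square"
    assert list(df.index) == list(df.columns), "Matrix index and columns must match"
    return df


# Travel times between three towns with ids 0, 1, 2 (made-up values).
good = pd.DataFrame(
    [[0.0, 12.0, 30.0], [12.0, 0.0, 18.0], [30.0, 18.0, 0.0]],
    index=[0, 1, 2],
    columns=[0, 1, 2],
)
check_adjacency_matrix(good)  # passes and returns the frame unchanged

# Relabel one column so the column labels no longer match the index.
bad = good.rename(columns={2: 5})
try:
    check_adjacency_matrix(bad)
    rejected = False
except AssertionError:
    rejected = True
```

In this sketch the relabelled matrix is still square, so only the label comparison rejects it.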

validate_model()

Validates that the coordinate reference systems (CRS) of the towns and units match, and that the adjacency matrix matches the town indices.

Returns

PopulationFiller

The validated PopulationFiller instance.

Raises

AssertionError

If the CRS of the towns and units do not match, or if the matrix indices and town indices do not match.

_get_median_time(town_id) → float

Calculates the median time from the adjacency matrix for a given town.

Parameters

town_id : int

The ID of the town for which to calculate the median time.

Returns

float

The median travel time for the town.
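As a rough illustration, the median for a town can be taken over that town's row of the adjacency matrix. The helper `median_time` and the matrix values below are hypothetical. Whether the real method excludes the zero self-distance on the diagonal is not stated here; this sketch includes it.

```python
import pandas as pd


def median_time(adjacency_matrix: pd.DataFrame, town_id: int) -> float:
    """Hypothetical helper: median of the town's row, diagonal included."""
    return float(adjacency_matrix.loc[town_id].median())


# Made-up travel times between towns 0, 1 and 2.
matrix = pd.DataFrame(
    [[0.0, 12.0, 30.0], [12.0, 0.0, 18.0], [30.0, 18.0, 0.0]],
    index=[0, 1, 2],
    columns=[0, 1, 2],
)

t0 = median_time(matrix, 0)  # median of 0, 12, 30
t2 = median_time(matrix, 2)  # median of 30, 18, 0
```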

fill() → GeoDataFrame[TownRow]

Distributes the population of each geographic unit among towns, based on proximity and the population distribution model.

Returns

GeoDataFrame[TownRow]

A GeoDataFrame with updated population data for the towns.
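The distribution model itself is not spelled out on this page. The sketch below shows one plausible weighting under loudly stated assumptions: each town in a unit gets a weight inversely proportional to its median travel time, city weights are scaled by `city_multiplier`, and the unit's population is split in proportion to those weights. The function `distribute_population`, the dictionary layout, and the inverse-time weighting are all hypothetical. The real method works on GeoDataFrames and also has to decide which towns belong to which unit, which this sketch skips.

```python
def distribute_population(unit_population, towns, city_multiplier=10.0):
    """Split one unit's population among its towns (assumed weighting).

    towns: list of dicts with keys 'name', 'is_city', 'median_time'.
    """
    weights = {}
    for town in towns:
        # Assumption: cities are boosted, and longer median times reduce weight.
        base = city_multiplier if town["is_city"] else 1.0
        weights[town["name"]] = base / town["median_time"]
    total = sum(weights.values())
    # Normalise so the shares add up to the unit's population.
    return {name: unit_population * w / total for name, w in weights.items()}


# Made-up towns inside a single unit of 1200 residents.
towns_in_unit = [
    {"name": "A", "is_city": True, "median_time": 8.0},
    {"name": "B", "is_city": False, "median_time": 8.0},
    {"name": "C", "is_city": False, "median_time": 8.0},
]
shares = distribute_population(1200, towns_in_unit)
```

With equal median times, the city in this sketch receives ten times the share of each non-city town, and the shares sum to the unit's population.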

model_computed_fields: ClassVar[dict[str, ComputedFieldInfo]] = {}

A dictionary of computed field names and their corresponding ComputedFieldInfo objects.