Full Package Reference

This page contains the full reference for the gerrychain package, and is included primarily for the sake of developers trying to hunt down errors or understand the codebase. For a more user-friendly introduction to the usage of the package, please see the API Reference.

Accept

This module provides the main acceptance function used in ReCom Markov chains.

Dependencies:

  • random: For random number generation for probabilistic acceptance.

Last Updated: 11 Jan 2024

gerrychain.accept.always_accept(partition: Partition) bool[source]
gerrychain.accept.cut_edge_accept(partition: Partition) bool[source]

Always accepts the flip if the number of cut_edges increases. Otherwise, uses the Metropolis criterion to decide.

Parameters

partition (Partition) – The current partition to accept a flip from.

Returns

True if accepted, False to remain in place

Return type

bool

Chain

This module provides the MarkovChain class, which is designed to facilitate the creation and iteration of Markov chains in the context of political redistricting and gerrymandering analysis. It allows for the exploration of different districting plans based on specified constraints and acceptance criteria.

Key Components:

  • MarkovChain: The main class used for creating and iterating over Markov chain states.

  • Validator: A helper class for validating proposed states in the Markov chain. See Validator for more details.

Usage: The primary use of this module is to create an instance of MarkovChain with appropriate parameters like proposal function, constraints, acceptance function, and initial state, and then to iterate through the states of the Markov chain, yielding a new proposal at each step.

Dependencies:

  • typing: Used for type hints.

Last Updated: 11 Jan 2024

class gerrychain.chain.MarkovChain(proposal: Callable, constraints: Union[Iterable[Callable], Validator, Iterable[Bounds], Callable], accept: Callable, initial_state: Optional[Partition], total_steps: int)[source]

Bases: object

MarkovChain is a class that creates an iterator for iterating over the states of a Markov chain run in a gerrymandering analysis context.

It allows for the generation of a sequence of partitions (states) of a political districting plan, where each partition represents a possible state in the Markov chain.

Example usage:

chain = MarkovChain(proposal, constraints, accept, initial_state, total_steps)
for state in chain:
    # Do whatever you want - print output, compute scores, ...
Parameters
  • proposal (Callable) – Function proposing the next state from the current state.

  • constraints (Union[Iterable[Callable], Validator, Iterable[Bounds], Callable]) – A function with signature Partition -> bool determining whether the proposed next state is valid (passes all binary constraints). Usually this is a Validator class instance.

  • accept (Callable) – Function accepting or rejecting the proposed state. In the most basic use case, this always returns True. But if the user wanted to use a Metropolis-Hastings acceptance rule, this is where you would implement it.

  • initial_state (Optional[Partition]) – Initial gerrychain.partition.Partition class.

  • total_steps (int) – Number of steps to run.

Returns

None

Raises

ValueError – If the initial_state is not valid according to the constraints.

property constraints: Validator

Read_only alias for the is_valid property. Returns the constraints of the Markov chain.

Returns

The constraints of the Markov chain.

Return type

String

with_progress_bar()[source]

Wraps the Markov chain in a tqdm progress bar.

Useful for long-running Markov chains where you want to keep track of the progress. Requires the tqdm package to be installed.

Returns

A tqdm-wrapped Markov chain.

Grid

This module provides a Grid class used for creating and manipulating grid partitions. It’s part of the GerryChain suite, designed to facilitate experiments with redistricting plans without the need for extensive data processing. This module relies on NetworkX for graph operations and integrates with GerryChain’s Partition class.

Dependencies:

  • math: For math.floor() function.

  • networkx: For graph operations with using the graph structure in

    Graph.

  • typing: Used for type hints.

class gerrychain.grid.Grid(dimensions: Optional[Tuple[int, int]] = None, with_diagonals: bool = False, assignment: Optional[Dict] = None, updaters: Optional[Dict[str, Callable]] = None, parent: Optional[Grid] = None, flips: Optional[Dict[Tuple[int, int], int]] = None)[source]

Bases: Partition

The Grid class represents a grid partitioned into districts. It is useful for running little experiments with GerryChain without needing to do any data processing or cleaning to get started.

Example usage:

grid = Grid((10,10))

The nodes of grid.graph are labelled by tuples (i,j), for 0 <= i <= 10 and 0 <= j <= 10. Each node has an area of 1 and each edge has shared_perim 1.

If the updaters are not specified, the default updaters are used, which are as follows:

default_updaters = {
    "cut_edges": cut_edges,
    "population": Tally("population"),
    "perimeter": perimeter,
    "exterior_boundaries": exterior_boundaries,
    "interior_boundaries": interior_boundaries,
    "boundary_nodes": boundary_nodes,
    "area": Tally("area", alias="area"),
    "polsby_popper": polsby_popper,
    "cut_edges_by_part": cut_edges_by_part,
}
Parameters
  • dimensions (Tuple[int, int], optional) – The grid dimensions (rows, columns), defaults to None.

  • with_diagonals (bool, optional) – If True, includes diagonal connections, defaults to False.

  • assignment (Dict, optional) – Node-to-district assignments, defaults to None.

  • updaters (Dict[str, Callable], optional) – Custom updater functions, defaults to None.

  • parent (Grid, optional) – Parent Grid object for inheritance, defaults to None.

  • flips (Dict[Tuple[int, int], int], optional) – Node flips for partition changes, defaults to None.

Raises

Exception – If neither dimensions nor parent is provided.

as_list_of_lists()[source]

Returns the grid as a list of lists (like a matrix), where the (i,j)th entry is the assigned district of the node in position (i,j) on the grid.

Returns

List of lists representing the grid.

Return type

List[List[int]]

assignment
default_updaters = {'area': <gerrychain.updaters.tally.Tally object>, 'boundary_nodes': <function boundary_nodes>, 'cut_edges': <function cut_edges>, 'cut_edges_by_part': <function cut_edges_by_part>, 'exterior_boundaries': <function exterior_boundaries>, 'interior_boundaries': <function interior_boundaries>, 'perimeter': <function perimeter>, 'polsby_popper': <function polsby_popper>, 'population': <gerrychain.updaters.tally.Tally object>}
edge_flows
flips
flows
graph
parent
subgraphs
updaters
gerrychain.grid.color_half(node: Tuple[int, int], threshold: int) int[source]

Assigns a color (as an integer) to a node based on its x-coordinate.

This function is used to partition the grid into two parts based on a given threshold. Nodes with an x-coordinate less than or equal to the threshold are assigned one color, and nodes with an x-coordinate greater than the threshold are assigned another.

Parameters
  • node (Tuple[int, int]) – The node to color, represented as a tuple of coordinates (x, y).

  • threshold (int) – The x-coordinate value that determines the color assignment.

Returns

An integer representing the color of the node. Returns 0 for nodes with x-coordinate less than or equal to the threshold, and 1 otherwise.

Return type

int

gerrychain.grid.color_quadrants(node: Tuple[int, int], thresholds: Tuple[int, int]) int[source]

Assigns a color (as an integer) to a node based on its position relative to specified threshold coordinates, effectively dividing the grid into four quadrants.

The function uses two threshold values (one for each axis) to determine the color. Each combination of being higher or lower than the threshold on each axis results in a different color.

Parameters
  • node (Tuple[int, int]) – The node to color, represented as a tuple of coordinates (x, y).

  • thresholds (Tuple[int, int]) – A tuple of two integers representing the threshold coordinates (x_threshold, y_threshold).

Returns

An integer representing the color of the node, determined by its quadrant.

Return type

int

gerrychain.grid.create_grid_graph(dimensions: Tuple[int, int], with_diagonals: bool) Graph[source]

Creates a grid graph with the specified dimensions. Optionally includes diagonal connections between nodes.

Parameters
  • dimensions (Tuple[int, int]) – The grid dimensions (rows, columns).

  • with_diagonals (bool) – If True, includes diagonal connections.

Returns

A grid graph.

Return type

Graph

Raises

ValueError – If the dimensions are not a tuple of length 2.

gerrychain.grid.get_boundary_perim(node: Tuple[int, int], dimensions: Tuple[int, int]) int[source]

Determines the boundary perimeter of a node on the grid. The boundary perimeter is the number of sides of the node that are on the boundary of the grid.

Parameters
  • node (Tuple[int, int]) – The node to check.

  • dimensions (Tuple[int, int]) – The dimensions of the grid.

Returns

The boundary perimeter of the node.

Return type

int

gerrychain.grid.give_constant_attribute(graph: Graph, attribute: Any, value: Any) None[source]

Sets the specified attribute to the specified value for all nodes in the graph.

Parameters
  • graph (Graph) – The graph to modify.

  • attribute (Any) – The attribute to set.

  • value (Any) – The value to set the attribute to.

Returns

None

gerrychain.grid.tag_boundary_nodes(graph: Graph, dimensions: Tuple[int, int]) None[source]

Adds the boolean attribute boundary_node to each node in the graph. If the node is on the boundary of the grid, that node also gets the attribute boundary_perim which is determined by the function get_boundary_perim().

Parameters
  • graph (Graph) – The graph to modify.

  • dimensions (Tuple[int, int]) – The dimensions of the grid.

Returns

None

Metagraph

This module provides the main tools for interacting with the metagraph of partitions. The metagraph of partitions is the set of partitions that are reachable from the current partition by a single flip.

Dependencies:

  • itertools: Used for product() function.

  • typing: Used for type hints.

Last Updated: 11 Jan 2024

gerrychain.metagraph.all_cut_edge_flips(partition: Partition) Iterator[Dict][source]

Generate all possible flips of cut edges in a partition without any constraints.

Parameters

partition (Partition) – The partition object.

Returns

An iterator that yields dictionaries representing the flipped edges.

Return type

Iterator[Dict]

gerrychain.metagraph.all_valid_flips(partition: Partition, constraints: Union[Iterable[Callable], Callable]) Iterator[Dict][source]

Generate all valid flips for a given partition subject to the prescribed constraints.

Parameters
  • partition (Partition) – The initial partition.

  • constraints (Union[Iterable[Callable], Callable]) – The constraints to be satisfied. Can be a single constraint or an iterable of constraints.

Returns

An iterator that yields dictionaries representing valid flips.

Return type

Iterator[Dict]

gerrychain.metagraph.all_valid_states_one_flip_away(partition: Partition, constraints: Union[Iterable[Callable], Callable]) Iterator[Partition][source]

Generates all valid Partitions that differ from the given partition by one flip. These are the given partition’s neighbors in the metagraph of partitions. (The metagraph of partitions is the set of partitions that is reachable from the given partition by a single flip under the prescribed constraints.)

Parameters
  • partition (Partition) – The initial partition.

  • constraints (Union[Iterable[Callable], Callable]) – Constraints to determine the validity of a partition. It can be a single callable or an iterable of callables.

Returns

An iterator that yields all valid partitions that differ from the given partition by one flip.

Return type

Iterator[Partition]

gerrychain.metagraph.metagraph_degree(partition: Partition, constraints: Union[Iterable[Callable], Callable]) int[source]

Calculate the degree of the node in the metagraph of the given partition. That is to say, compute how many possible valid states are reachable from the state given by partition in a single flip subject to the prescribed constraints.

Parameters
  • partition (Partition) – The partition object representing the current state.

  • constraints (Union[Iterable[Callable], Callable]) – The constraints to be applied to the partition. It can be a single constraint or an iterable of constraints.

Returns

The degree of the partition node in the metagraph.

Return type

int

Tree

This module provides tools and algorithms for manipulating and analyzing graphs, particularly focused on partitioning graphs based on population data. It leverages the NetworkX library to handle graph structures and implements various algorithms for graph partitioning and tree traversal.

Key functionalities include:

  • Predecessor and successor functions for graph traversal using breadth-first search.

  • Implementation of random and uniform spanning trees for graph partitioning.

  • The PopulatedGraph class, which represents a graph with additional population data, and methods for assessing and modifying this data.

  • Functions for finding balanced edge cuts in a populated graph, either through contraction or memoization techniques.

  • A suite of functions (bipartition_tree, recursive_tree_part, get_seed_chunks, etc.) for partitioning graphs into balanced subsets based on population targets and tolerances.

  • Utility functions like get_max_prime_factor_less_than and recursive_seed_part_inner to assist in complex partitioning tasks.

Dependencies:

  • networkx: Used for graph data structure and algorithms.

  • random: Provides random number generation for probabilistic approaches.

  • typing: Used for type hints.

Last Updated: 30 Jan 2024

exception gerrychain.tree.BalanceError[source]

Bases: Exception

Raised when a balanced cut cannot be found.

exception gerrychain.tree.BipartitionWarning[source]

Bases: UserWarning

Generally raised when it is proving difficult to find a balanced cut.

class gerrychain.tree.Cut(edge, subset)

Bases: tuple

Represents a cut in a graph.

Create new instance of Cut(edge, subset)

edge

The edge where the cut is made.

subset

The subset of nodes on one side of the cut.

class gerrychain.tree.PopulatedGraph(graph: Graph, populations: Dict, ideal_pop: Union[float, int], epsilon: float)[source]

Bases: object

A class representing a graph with population information.

Variables
  • graph – The underlying graph structure.

  • subsets – A dictionary mapping nodes to their subsets.

  • population – A dictionary mapping nodes to their populations.

  • tot_pop – The total population of the graph.

  • ideal_pop – The ideal population for each district.

  • epsilon – The tolerance for population deviation from the ideal population within each district.

Parameters
  • graph (nx.Graph) – The underlying graph structure.

  • populations (Dict) – A dictionary mapping nodes to their populations.

  • ideal_pop (Union[float, int]) – The ideal population for each district.

  • epsilon (float) – The tolerance for population deviation as a percentage of the ideal population within each district.

contract_node(node, parent) None[source]
degree(node) int[source]
has_ideal_population(node) bool[source]
exception gerrychain.tree.ReselectException[source]

Bases: Exception

Raised when the tree-splitting algorithm is unable to find a balanced cut after some maximum number of attempts, but the user has allowed the algorithm to reselect the pair of districts from parent graph to try and recombine.

gerrychain.tree.bipartition_tree(graph: ~networkx.classes.graph.Graph, pop_col: str, pop_target: ~typing.Union[int, float], epsilon: float, node_repeats: int = 1, spanning_tree: ~typing.Optional[~networkx.classes.graph.Graph] = None, spanning_tree_fn: ~typing.Callable = <function random_spanning_tree>, weight_dict: ~typing.Optional[~typing.Dict] = None, balance_edge_fn: ~typing.Callable = <function find_balanced_edge_cuts_memoization>, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>, max_attempts: ~typing.Optional[int] = 100000, allow_pair_reselection: bool = False) Set[source]

This function finds a balanced 2 partition of a graph by drawing a spanning tree and finding an edge to cut that leaves at most an epsilon imbalance between the populations of the parts. If a root fails, new roots are tried until node_repeats in which case a new tree is drawn.

Builds up a connected subgraph with a connected complement whose population is epsilon * pop_target away from pop_target.

Parameters
  • graph (nx.Graph) – The graph to partition.

  • pop_col (str) – The node attribute holding the population of each node.

  • pop_target (Union[int, float]) – The target population for the returned subset of nodes.

  • epsilon (float) – The allowable deviation from pop_target (as a percentage of pop_target) for the subgraph’s population.

  • node_repeats (int, optional) – A parameter for the algorithm: how many different choices of root to use before drawing a new spanning tree. Defaults to 1.

  • spanning_tree (Optional[nx.Graph], optional) – The spanning tree for the algorithm to use (used when the algorithm chooses a new root and for testing).

  • spanning_tree_fn (Callable, optional) – The random spanning tree algorithm to use if a spanning tree is not provided. Defaults to random_spanning_tree().

  • weight_dict (Optional[Dict], optional) – A dictionary of weights for the spanning tree algorithm. Defaults to None.

  • balance_edge_fn (Callable, optional) – The function to find balanced edge cuts. Defaults to find_balanced_edge_cuts_memoization().

  • choice (Callable, optional) – The function to make a random choice. Can be substituted for testing. Defaults to random.choice().

  • max_attempts (Optional[int], optional) – The maximum number of attempts that should be made to bipartition. Defaults to 10000.

  • allow_pair_reselection (bool, optional) – Whether we would like to return an error to the calling function to ask it to reselect the pair of nodes to try and recombine. Defaults to False.

Returns

A subset of nodes of graph (whose induced subgraph is connected). The other part of the partition is the complement of this subset.

Return type

Set

Raises
  • BipartitionWarning – If a possible cut cannot be found after 50 attempts.

  • RuntimeError – If a possible cut cannot be found after the maximum number of attempts given by max_attempts.

gerrychain.tree.bipartition_tree_random(graph: ~networkx.classes.graph.Graph, pop_col: str, pop_target: ~typing.Union[int, float], epsilon: float, node_repeats: int = 1, repeat_until_valid: bool = True, spanning_tree: ~typing.Optional[~networkx.classes.graph.Graph] = None, spanning_tree_fn: ~typing.Callable = <function random_spanning_tree>, balance_edge_fn: ~typing.Callable = <function find_balanced_edge_cuts_memoization>, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>, max_attempts: ~typing.Optional[int] = 100000) Optional[Set[Any]][source]

This is like bipartition_tree() except it chooses a random balanced cut, rather than the first cut it finds.

This function finds a balanced 2 partition of a graph by drawing a spanning tree and finding an edge to cut that leaves at most an epsilon imbalance between the populations of the parts. If a root fails, new roots are tried until node_repeats in which case a new tree is drawn.

Builds up a connected subgraph with a connected complement whose population is epsilon * pop_target away from pop_target.

Parameters
  • graph (nx.Graph) – The graph to partition

  • pop_col (str) – The node attribute holding the population of each node

  • pop_target (Union[int, float]) – The target population for the returned subset of nodes

  • epsilon (float) – The allowable deviation from pop_target (as a percentage of pop_target) for the subgraph’s population

  • node_repeats (int) – A parameter for the algorithm: how many different choices of root to use before drawing a new spanning tree. Defaults to 1.

  • repeat_until_valid (bool, optional) – Determines whether to keep drawing spanning trees until a tree with a balanced cut is found. If True, a set of nodes will always be returned; if False, None will be returned if a valid spanning tree is not found on the first try. Defaults to True.

  • spanning_tree (Optional[nx.Graph], optional) – The spanning tree for the algorithm to use (used when the algorithm chooses a new root and for testing). Defaults to None.

  • spanning_tree_fn (Callable, optional) – The random spanning tree algorithm to use if a spanning tree is not provided. Defaults to random_spanning_tree().

  • balance_edge_fn (Callable, optional) – The algorithm used to find balanced cut edges. Defaults to find_balanced_edge_cuts_memoization().

  • choice (Callable, optional) – The random choice function. Can be substituted for testing. Defaults to random.choice().

  • max_attempts (Optional[int], optional) – The max number of attempts that should be made to bipartition. Defaults to None.

Returns

A subset of nodes of graph (whose induced subgraph is connected) or None if a valid spanning tree is not found.

Return type

Union[Set[Any], None]

gerrychain.tree.find_balanced_edge_cuts_contraction(h: ~gerrychain.tree.PopulatedGraph, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>) List[Cut][source]

Find balanced edge cuts using contraction.

Parameters
  • h (PopulatedGraph) – The populated graph.

  • choice (Callable, optional) – The function used to make random choices.

Returns

A list of balanced edge cuts.

Return type

List[Cut]

gerrychain.tree.find_balanced_edge_cuts_memoization(h: ~gerrychain.tree.PopulatedGraph, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>) List[Cut][source]

Find balanced edge cuts using memoization.

This function takes a PopulatedGraph object and a choice function as input and returns a list of balanced edge cuts. A balanced edge cut is defined as a cut that divides the graph into two subsets, such that the population of each subset is close to the ideal population defined by the PopulatedGraph object.

Parameters
  • h (PopulatedGraph) – The PopulatedGraph object representing the graph.

  • choice (Callable, optional) – The choice function used to select the root node.

Returns

A list of balanced edge cuts.

Return type

List[Cut]

gerrychain.tree.get_max_prime_factor_less_than(n: int, ceil: int) Optional[int][source]
Helper function for recursive_seed_part_inner. Returns the largest prime factor of n

less than ceil, or None if all are greater than ceil.

Parameters
  • n (int) – The number to find the largest prime factor for.

  • ceil (int) – The upper limit for the largest prime factor.

Returns

The largest prime factor of n less than ceil, or None if all are greater than ceil.

Return type

Optional[int]

gerrychain.tree.get_seed_chunks(graph: ~networkx.classes.graph.Graph, num_chunks: int, num_dists: int, pop_target: ~typing.Union[int, float], pop_col: str, epsilon: float, node_repeats: int = 1, method: ~typing.Callable = functools.partial(<function bipartition_tree_random>, max_attempts=10000)) List[List[int]][source]

Helper function for recursive_seed_part. Partitions the graph into num_chunks chunks, balanced within new_epsilon <= epsilon of a balanced target population.

Parameters
  • graph (nx.Graph) – The graph

  • num_chunks (int) – The number of chunks to partition the graph into

  • num_dists (int) – The number of districts

  • pop_target (Union[int, float]) – The target population of the districts (not of the chunks)

  • pop_col (str) – Node attribute key holding population data

  • epsilon (float) – How far (as a percentage of pop_target) from pop_target the parts of the partition can be

  • node_repeats (int, optional) – Parameter for bipartition_tree_random() to use. Defaults to 1.

  • method (Callable, optional) – The method to use for bipartitioning the graph. Defaults to bipartition_tree_random()

Returns

New assignments for the nodes of graph.

Return type

List[List[int]]

gerrychain.tree.predecessors(h: Graph, root: Any) Dict[source]
gerrychain.tree.random_spanning_tree(graph: Graph, weight_dict: Optional[Dict] = None) Graph[source]

Builds a spanning tree chosen by Kruskal’s method using random weights.

Parameters
  • graph (nx.Graph) – The input graph to build the spanning tree from. Should be a Networkx Graph.

  • weight_dict (Optional[Dict], optional) – Dictionary of weights to add to the random weights used in region-aware variants.

Returns

The maximal spanning tree represented as a Networkx Graph.

Return type

nx.Graph

gerrychain.tree.recursive_seed_part(graph: ~networkx.classes.graph.Graph, parts: ~typing.Sequence, pop_target: ~typing.Union[float, int], pop_col: str, epsilon: float, method: ~typing.Callable = functools.partial(<function bipartition_tree>, max_attempts=10000), node_repeats: int = 1, n: ~typing.Optional[int] = None, ceil: ~typing.Optional[int] = None) Dict[source]

Returns a partition with num_dists districts balanced within epsilon of pop_target by recursively splitting graph using recursive_seed_part_inner.

Parameters
  • graph (nx.Graph) – The graph

  • parts (Sequence) – Iterable of part labels (like [0,1,2] or range(4)

  • pop_target (Union[float, int]) – Target population for each part of the partition

  • pop_col (str) – Node attribute key holding population data

  • epsilon (float) – How far (as a percentage of pop_target) from pop_target the parts of the partition can be

  • method (Callable, optional) – Function used to find balanced partitions at the 2-district level Defaults to bipartition_tree()

  • node_repeats (int, optional) – Parameter for bipartition_tree() to use. Defaults to 1.

  • n (Optional[int], optional) – Either a positive integer (greater than 1) or None. If n is a positive integer, this function will recursively create a seed plan by either biting off districts from graph or dividing graph into n chunks and recursing into each of these. If n is None, this function prime factors ``num_dists``=n_1*n_2*…*n_k (n_1 > n_2 > … n_k) and recursively partitions graph into n_1 chunks. Defaults to None.

  • ceil (Optional[int], optional) – Either a positive integer (at least 2) or None. Relevant only if n is None. If ceil is a positive integer then finds the largest factor of num_dists less than or equal to ceil, and recursively splits graph into that number of chunks, or bites off a district if that number is 1. Defaults to None.

Returns

New assignments for the nodes of graph.

Return type

dict

gerrychain.tree.recursive_seed_part_inner(graph: ~networkx.classes.graph.Graph, num_dists: int, pop_target: ~typing.Union[float, int], pop_col: str, epsilon: float, method: ~typing.Callable = functools.partial(<function bipartition_tree>, max_attempts=10000), node_repeats: int = 1, n: ~typing.Optional[int] = None, ceil: ~typing.Optional[int] = None) List[Set][source]

Inner function for recursive_seed_part. Returns a partition with num_dists districts balanced within epsilon of pop_target. Splits graph into num_chunks chunks, and then recursively splits each chunk into num_dists/num_chunks chunks. The number num_chunks of chunks is chosen based on n and ceil as follows:

  • If n is None, and ceil is None, num_chunks is the largest prime factor of num_dists.

  • If n is None and ceil is an integer at least 2, then num_chunks is the largest prime factor of num_dists that is less than ceil

  • If n is a positive integer, num_chunks equals n.

Finally, if the number of chunks as chosen above does not divide num_dists, then this function bites off a single district from the graph and recursively partitions the remaining graph into num_dists - 1 districts.

Parameters
  • graph (nx.Graph) – The underlying graph structure.

  • num_dists (int) – number of districts to partition the graph into

  • pop_target (Union[float, int]) – Target population for each part of the partition

  • pop_col (str) – Node attribute key holding population data

  • epsilon (float) – How far (as a percentage of pop_target) from pop_target the parts of the partition can be

  • method (Callable, optional) – Function used to find balanced partitions at the 2-district level. Defaults to bipartition_tree()

  • node_repeats (int, optional) – Parameter for bipartition_tree() to use. Defaults to 1.

  • n (Optional[int], optional) – Either a positive integer (greater than 1) or None. If n is a positive integer, this function will recursively create a seed plan by either biting off districts from graph or dividing graph into n chunks and recursing into each of these. If n is None, this function prime factors ``num_dists``=n_1*n_2*…*n_k (n_1 > n_2 > … n_k) and recursively partitions graph into n_1 chunks. Defaults to None.

  • ceil (Optional[int], optional) – Either a positive integer (at least 2) or None. Relevant only if n is None. If ceil is a positive integer then finds the largest factor of num_dists less than or equal to ceil, and recursively splits graph into that number of chunks, or bites off a district if that number is 1. Defaults to None.

Returns

New assignments for the nodes of graph.

Return type

List of lists, each list is a district

gerrychain.tree.recursive_tree_part(graph: ~networkx.classes.graph.Graph, parts: ~typing.Sequence, pop_target: ~typing.Union[float, int], pop_col: str, epsilon: float, node_repeats: int = 1, method: ~typing.Callable = functools.partial(<function bipartition_tree>, max_attempts=10000)) Dict[source]

Uses bipartition_tree() recursively to partition a tree into len(parts) parts of population pop_target (within epsilon). Can be used to generate initial seed plans or to implement ReCom-like “merge walk” proposals.

Parameters
  • graph (nx.Graph) – The graph

  • parts (Sequence) – Iterable of part labels (like [0,1,2] or range(4))

  • pop_target (Union[float, int]) – Target population for each part of the partition

  • pop_col (str) – Node attribute key holding population data

  • epsilon (float) – How far (as a percentage of pop_target) from pop_target the parts of the partition can be

  • node_repeats (int, optional) – Parameter for bipartition_tree() to use. Defaluts to 1.

  • method (Callable, optional) – The partition method to use. Defaults to partial(bipartition_tree, max_attempts=10000).

Returns

New assignments for the nodes of graph.

Return type

dict

gerrychain.tree.successors(h: Graph, root: Any) Dict[source]
gerrychain.tree.uniform_spanning_tree(graph: ~networkx.classes.graph.Graph, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>) Graph[source]

Builds a spanning tree chosen uniformly from the space of all spanning trees of the graph. Uses Wilson’s algorithm.

Parameters
  • graph (nx.Graph) – Networkx Graph

  • choice (Callable, optional) – random.choice(). Defaults to random.choice().

Returns

A spanning tree of the graph chosen uniformly at random.

Return type

nx.Graph

Constraints

class gerrychain.constraints.bounds.Bounds(func: Callable, bounds: Tuple[float, float])[source]

Bases: object

Wrapper for numeric-validators to enforce upper and lower limits.

This class is meant to be called as a function after instantiation; its return is True if the numeric validator is within set limits, and False otherwise.

Parameters
  • func (Callable) – Numeric validator function. Should return an iterable of values.

  • bounds (Tuple[float, float]) – Tuple of (lower, upper) numeric bounds.

class gerrychain.constraints.bounds.LowerBound(func: Callable, bound: float)[source]

Bases: object

Wrapper for numeric-validators to enforce lower limits.

This class is meant to be called as a function after instantiation; its return is True if the numeric validator is within a set lower limit, and False otherwise.

Parameters
  • func (Callable) – Numeric validator function. Should return a comparable value.

  • bounds (float) – Comparable lower bound.

class gerrychain.constraints.bounds.SelfConfiguringLowerBound(func: Callable, epsilon: float = 0.05)[source]

Bases: object

Wrapper for numeric-validators to enforce automatic lower limits.

When instantiated, the initial lower bound is set as the initial value of the numeric-validator minus some configurable ε.

This class is meant to be called as a function after instantiation; its return is True if the numeric validator is within a set lower limit, and False otherwise.

Parameters
  • func (Callable) – Numeric validator function.

  • epsilon (float, optional) – Initial population deviation allowable by the validator as a percentage of the ideal population. Defaults to 0.05.

class gerrychain.constraints.bounds.SelfConfiguringUpperBound(func: Callable)[source]

Bases: object

Wrapper for numeric-validators to enforce automatic upper limits.

When instantiated, the initial upper bound is set as the initial value of the numeric-validator.

This class is meant to be called as a function after instantiation; its return is True if the numeric validator is within a set upper limit, and False otherwise.

Parameters

func (Callable) – Numeric validator function.

class gerrychain.constraints.bounds.UpperBound(func: Callable, bound: float)[source]

Bases: object

Wrapper for numeric-validators to enforce upper limits.

This class is meant to be called as a function after instantiation; its return is True if the numeric validator is within a set upper limit, and False otherwise.

Parameters
  • func (Callable) – Numeric validator function. Should return a comparable value.

  • bounds (float) – Comparable upper bound.

class gerrychain.constraints.bounds.WithinPercentRangeOfBounds(func: Callable, percent: float)[source]

Bases: object

Wrapper for numeric-validators to enforce upper and lower limits determined by a percentage of the initial value.

When instantiated, the initial upper and lower bounds are set as the initial value of the numeric-validator times (1 ± percent).

This class is meant to be called as a function after instantiation; its return is True if the numeric validator is within the desired percentage range of the initial value, and False otherwise.

Parameters
  • func (Callable) – Numeric validator function.

  • percent (float) – Percentage of the initial value to use as the bounds.

Returns

None

Warning

The percentage is assumed to be in the range [0.0, 100.0].

gerrychain.constraints.compactness.L1_polsby_popper(partition: Partition) float[source]

Returns the \(L^1\) norm of the Polsby-Popper scores for the given partition

Parameters

partition (Partition) – Partition representing a districting plan

Returns

\(L^1\) norm of the reciprocal Polsby-Popper scores

Return type

float

gerrychain.constraints.compactness.L1_reciprocal_polsby_popper(partition: Partition) float[source]

Returns the \(L^1\) norm of the reciprocal Polsby-Popper scores for the given partition

Parameters

partition (Partition) – Partition representing a districting plan

Returns

\(L^1\) norm of the reciprocal Polsby-Popper scores

Return type

float

gerrychain.constraints.compactness.L2_polsby_popper(partition: Partition) float[source]

Returns the \(L^2\) norm of the Polsby-Popper scores for the given partition.

Parameters

partition (Partition) – Partition representing a districting plan

Returns

\(L^2\) norm of the Polsby-Popper scores

Return type

float

gerrychain.constraints.compactness.L_minus_1_polsby_popper(partition)[source]

Returns the \(L^{-1}\) norm of the Polsby-Popper scores for the given partition.

Parameters

partition (Partition) – Partition representing a districting plan

Returns

\(L^{-1}\) norm of the Polsby-Popper scores

Return type

float

gerrychain.constraints.contiguity.affected_parts(partition: Partition) Set[int][source]

Checks which partitions were affected by the change of nodes.

Parameters

partition (Partition) – The proposed next Partition

Returns

The set of IDs of all parts that gained or lost a node when compared to the parent partition.

Return type

Set[int]

gerrychain.constraints.contiguity.are_reachable(G: Graph, source: Any, avoid: Callable, targets: Any) bool[source]

A modified version of NetworkX’s function networkx.algorithms.shortest_paths.weighted._dijkstra_multisource()

This function checks if the targets are reachable from the source node while avoiding edges based on the avoid condition function.

Parameters
  • G (nx.Graph) – The networkx graph

  • source (int) – The starting node

  • avoid (Callable) – The function that determines if an edge should be avoided. It should take in three parameters: the start node, the end node, and the edges to avoid. It should return True if the edge should be avoided, False otherwise.

  • targets (Any) – The target nodes that we would like to reach

Returns

True if all of the targets are reachable from the source node under the avoid condition, False otherwise.

Return type

bool

gerrychain.constraints.contiguity.contiguous(partition: Partition) bool[source]

Check if the parts of a partition are connected using networkx.is_connected().

Parameters

partition (Partition) – The proposed next Partition

Returns

Whether the partition is contiguous

Return type

bool

gerrychain.constraints.contiguity.contiguous_bfs(partition: Partition) bool[source]

Checks that a given partition’s parts are connected as graphs using a simple breadth-first search.

Parameters

partition (Partition) – Instance of Partition

Returns

Whether the parts of this partition are connected

Return type

bool

gerrychain.constraints.contiguity.contiguous_components(partition: Partition) Dict[int, list][source]

Return the connected components of each of the subgraphs of the parts of the partition.

Parameters

partition (Partition) – Instance of Partition; contains connected components.

Returns

dictionary mapping each part ID to a list holding the connected subgraphs of that part of the partition

Return type

dict

gerrychain.constraints.contiguity.number_of_contiguous_parts(partition: Partition) int[source]
Parameters

partition (Partition) – Instance of Partition; contains connected components.

Returns

Number of contiguous parts in the partition.

Return type

int

gerrychain.constraints.contiguity.single_flip_contiguous(partition: Partition) bool[source]

Check if swapping the given node from its old assignment disconnects the old assignment class.

Parameters

partition (Partition) – The proposed next Partition

Returns

whether the partition is contiguous

Return type

bool

We assume that removed_node belonged to an assignment class that formed a connected subgraph. To see if its removal left the subgraph connected, we check that the neighbors of the removed node are still connected through the changed graph.

class gerrychain.constraints.validity.Validator(constraints: List[Callable])[source]

Bases: object

A single callable for checking that a partition passes a collection of constraints. Intended to be passed as the is_valid parameter when instantiating MarkovChain.

This class is meant to be called as a function after instantiation; its return is True if all validators pass, and False if any one fails.

Example usage:

is_valid = Validator([constraint1, constraint2, constraint3])
chain = MarkovChain(proposal, is_valid, accept, initial_state, total_steps)
Variables

constraints – List of validator functions that will check partitions.

Parameters

constraints (List[Callable]) – List of validator functions that will check partitions.

gerrychain.constraints.validity.deviation_from_ideal(partition: Partition, attribute: str = 'population') Dict[int, float][source]

Computes the deviation of the given attribute from exact equality among parts of the partition. Usually attribute is the population, and this function is used to compute how far a districting plan is from exact population equality.

By “deviation” we mean (actual_value - ideal)/ideal (not the absolute value).

Parameters
  • partition (Partition) – A partition.

  • attribute (str, optional) – The Tally to compute deviation for. Default is "population".

Returns

dictionary from parts to their deviation

Return type

Dict[int, float]

gerrychain.constraints.validity.districts_within_tolerance(partition: Partition, attribute_name: str = 'population', percentage: float = 0.1) bool[source]

Check if all districts are within a certain percentage of the “smallest” district, as defined by the given attribute.

Parameters
  • partition (Partition) – Partition class instance

  • attrName (str, optional) – String that is the name of an updater in partition. Default is "population".

  • percentage (float, optional) – What percent (as a number between 0 and 1) difference is allowed. Default is 0.1.

Returns

Whether the districts are within specified tolerance

Return type

bool

gerrychain.constraints.validity.no_vanishing_districts(partition: Partition) bool[source]

Require that no districts be completely consumed.

Parameters

partition (Partition) – Partition to check.

Returns

Whether no districts are completely consumed.

Return type

bool

gerrychain.constraints.validity.refuse_new_splits(partition_county_field: str) Callable[[Partition], bool][source]

Refuse all proposals that split a county that was previous unsplit.

Parameters

partition_county_field (str) – Name of field for county information generated by county_splits().

Returns

Function that returns True if the proposal does not split any new counties.

Return type

Callable[[Partition], bool]

gerrychain.constraints.validity.within_percent_of_ideal_population(initial_partition: Partition, percent: float = 0.01, pop_key: str = 'population') Bounds[source]

Require that all districts are within a certain percent of “ideal” (i.e., uniform) population.

Ideal population is defined as “total population / number of districts.”

Parameters
  • initial_partition (Partition) – Starting partition from which to compute district information.

  • percent (float, optional) – Allowed percentage deviation. Default is 1%.

  • pop_key (str, optional) – The name of the population Tally. Default is "population".

Returns

A Bounds constraint on the population attribute identified by pop_key.

Return type

Bounds

Graph

This module provides a set of functions to help determine and manipulate the adjacency of geometries within a particular graph. The functions in this module are used internally to ensure that the geometry data that we are working with is sufficiently well-defined to be used for analysis.

Some of the type hints in this module are intentionally left unspecified because of import issues.

gerrychain.graph.adjacency.intersections_with_neighbors(geometries)[source]

Generator yielding tuples of the form (id, {neighbor_id: intersection}). The intersections may be empty!

Parameters

geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object.

Returns

A generator yielding tuples of the form (id, {neighbor_id: intersection})

Return type

Generator

gerrychain.graph.adjacency.neighboring_geometries(geometries, tree=None)[source]

Generator yielding tuples of the form (id, (ids of neighbors)).

Parameters
  • geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object to construct the tree from

  • tree (shapely.strtree.STRtree, optional) – A Sort-Tile-Recursive tree for spatial indexing. Default is None.

Returns

A generator yielding tuples of the form (id, (ids of neighbors))

Return type

Generator

gerrychain.graph.adjacency.neighbors(df: <Mock name='mock.GeoDataFrame' id='139929996772688'>, adjacency: str) Dict[source]
gerrychain.graph.adjacency.queen(geometries)[source]
Parameters

geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object.

Returns

The queen adjacency dictionary for the given collection of polygons.

Return type

Dict

gerrychain.graph.adjacency.rook(geometries)[source]
Parameters

geometries (shapely.geometry.BaseGeometry) – A Shapeley geometry object.

Returns

The rook adjacency dictionary for the given collection of polygons.

Return type

Dict

gerrychain.graph.adjacency.str_tree(geometries)[source]

Add ids to geometries and create a STR tree for spatial indexing. Use this for all spatial operations!

Parameters

geometries (shapely.geometry.BaseGeometry) – A Shapely geometry object to construct the tree from.

Returns

A Sort-Tile-Recursive tree for spatial indexing.

Return type

shapely.strtree.STRtree

gerrychain.graph.adjacency.warn_for_overlaps(intersection_pairs)[source]
Parameters

intersection_pairs (Iterable) – An iterable of tuples of the form (id, {neighbor_id: intersection})

Returns

A generator yielding tuples of intersection pairs

Return type

Generator

Raises

UserWarning if there are overlaps among the given polygons

This module contains functions for working with GeoDataFrames and Shapely geometries. Specifically, it contains functions for handling and reprojecting the Universal Transverse Mercator projection, and for identifying bad geometries within a given GeoDataFrame.

exception gerrychain.graph.geo.GeometryError[source]

Bases: Exception

Wrapper error class for projection failures. Changing a map’s projection may create invalid geometries, which may or may not be repairable using the .buffer(0) trick.

gerrychain.graph.geo.explain_validity(geo) str[source]

Given a geometry that is shapely interpretable, explain the validity. Light wrapper around shapely’s explain_validity.

Parameters

geo (shapely.geometry.BaseGeometry) – Shapely geometry object

Returns

Explanation for the validity of the geometry

Return type

str

gerrychain.graph.geo.identify_utm_zone(df: <Mock name='mock.GeoDataFrame' id='139929996772688'>) int[source]

Given a GeoDataFrame, identify the Universal Transverse Mercator zone number for the centroid of the geometries in the dataframe.

gerrychain.graph.geo.invalid_geometries(df)[source]

Given a GeoDataFrame, returns a list of row indices with invalid geometries.

Parameters

df (geopandas.GeoDataFrame) – The GeoDataFrame to examine

Returns

List of row indices with invalid geometries

Return type

list of int

gerrychain.graph.geo.reprojected(df)[source]
Returns a copy of df, projected into the coordinate reference system of a suitable

Universal Transverse Mercator zone.

Parameters

df (geopandas.GeoDataFrame) – The GeoDataFrame to reproject

Returns

A copy of df, projected into the coordinate reference system of a suitable UTM zone.

Return type

geopandas.GeoDataFrame

gerrychain.graph.geo.utm_of_point(point)[source]

Given a point, return the Universal Transverse Mercator zone number for that point.

This module provides tools for working with graphs in the context of geographic data. It extends the functionality of the NetworkX library, adding support for spatial data structures, geographic projections, and serialization to and from JSON format.

This module is designed to be used in conjunction with geopandas, shapely, and pandas libraries, facilitating the integration of graph-based algorithms with geographic information systems (GIS).

Note: This module relies on NetworkX, pandas, and geopandas, which should be installed and imported as required.

class gerrychain.graph.graph.FrozenGraph(graph: Graph)[source]

Bases: object

Represents an immutable graph to be partitioned. It is based off Graph.

This speeds up chain runs and prevents having to deal with cache invalidation issues. This class behaves slightly differently than Graph or networkx.Graph.

Not intended to be a part of the public API.

Variables
  • graph – The underlying graph.

  • size – The number of nodes in the graph.

Note

The class uses __slots__ for improved memory efficiency.

Initialize a FrozenGraph from a Graph.

param graph

The mutable Graph to be converted into an immutable graph

type graph

Graph

returns

None

degree(n: Any) int[source]
property edge_indices: Iterable[Any]
graph
lookup(node: Any, field: str) Any[source]
neighbors(n: Any) Tuple[Any, ...][source]
property node_indices: Iterable[Any]
size
subgraph(nodes: Iterable[Any]) FrozenGraph[source]
class gerrychain.graph.graph.Graph(incoming_graph_data=None, **attr)[source]

Bases: Graph

Represents a graph to be partitioned, extending the networkx.Graph.

This class includes additional class methods for constructing graphs from shapefiles, and for saving and loading graphs in JSON format.

Initialize a graph with edges, name, or graph attributes.

Parameters

incoming_graph_datainput graph (optional, default: None)

Data to initialize graph. If None (default) an empty graph is created. The data can be an edge list, or any NetworkX graph object. If the corresponding optional Python packages are installed the data can also be a 2D NumPy array, a SciPy sparse matrix, or a PyGraphviz graph.

attrkeyword arguments, optional (default= no attributes)

Attributes to add to graph as key=value pairs.

See Also

convert

Examples

>>> G = nx.Graph()  # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> G = nx.Graph(name="my graph")
>>> e = [(1, 2), (2, 3), (3, 4)]  # list of edges
>>> G = nx.Graph(e)

Arbitrary graph attribute pairs (key=value) may be assigned

>>> G = nx.Graph(e, day="Friday")
>>> G.graph
{'day': 'Friday'}
add_data(df: <Mock name='mock.DataFrame' id='139929998521008'>, columns: ~typing.Optional[~typing.Iterable[str]] = None) None[source]

Add columns of a DataFrame to a graph as node attributes by matching the DataFrame’s index to node ids.

Parameters
  • df (pandas.DataFrame) – Dataframe containing given columns.

  • columns (Optional[Iterable[str]], optional) – List of dataframe column names to add. Default is None.

Returns

None

property edge_indices
classmethod from_file(filename: str, adjacency: str = 'rook', cols_to_add: Optional[List[str]] = None, reproject: bool = False, ignore_errors: bool = False) Graph[source]

Create a Graph from a shapefile (or GeoPackage, or GeoJSON, or any other library that geopandas can read. See from_geodataframe() for more details.

Parameters
  • filename (str) – Path to the shapefile / GeoPackage / GeoJSON / etc.

  • adjacency (str, optional) – The adjacency type to use (“rook” or “queen”). Default is “rook”

  • cols_to_add (Optional[List[str]], optional) – The names of the columns that you want to add to the graph as node attributes. Default is None.

  • reproject (bool, optional) – Whether to reproject to a UTM projection before creating the graph. Default is False.

  • ignore_errors (bool, optional) – Whether to ignore all invalid geometries and try to continue creating the graph. Default is False.

Returns

The Graph object of the geometries from filename.

Return type

Graph

Warning

This method requires the optional geopandas dependency. So please install gerrychain with the geo extra via the command:

pip install gerrychain[geo]

or install geopandas separately.

classmethod from_geodataframe(dataframe: <Mock name='mock.DataFrame' id='139929998521008'>, adjacency: str = 'rook', cols_to_add: ~typing.Optional[~typing.List[str]] = None, reproject: bool = False, ignore_errors: bool = False) Graph[source]

Creates the adjacency Graph of geometries described by dataframe. The areas of the polygons are included as node attributes (with key area). The shared perimeter of neighboring polygons are included as edge attributes (with key shared_perim). Nodes corresponding to polygons on the boundary of the union of all the geometries (e.g., the state, if your dataframe describes VTDs) have a boundary_node attribute (set to True) and a boundary_perim attribute with the length of this “exterior” boundary.

By default, areas and lengths are computed in a UTM projection suitable for the geometries. This prevents the bizarro area and perimeter values that show up when you accidentally do computations in Longitude-Latitude coordinates. If the user specifies reproject=False, then the areas and lengths will be computed in the GeoDataFrame’s current coordinate reference system. This option is for users who have a preferred CRS they would like to use.

Parameters
  • dataframe (geopandas.GeoDataFrame) – The GeoDateFrame to convert

  • adjacency (str, optional) – The adjacency type to use (“rook” or “queen”). Default is “rook”.

  • cols_to_add (Optional[List[str]], optional) – The names of the columns that you want to add to the graph as node attributes. Default is None.

  • reproject (bool, optional) – Whether to reproject to a UTM projection before creating the graph. Default is False.

  • ignore_errors (bool, optional) – Whether to ignore all invalid geometries and attept to create the graph anyway. Default is False.

Returns

The adjacency graph of the geometries from dataframe.

Return type

Graph

classmethod from_json(json_file: str) Graph[source]

Load a graph from a JSON file in the NetworkX json_graph format.

Parameters

json_file (str) – Path to JSON file.

Returns

The loaded graph as an instance of this class.

Return type

Graph

classmethod from_networkx(graph: Graph) Graph[source]

Create a Graph instance from a networkx.Graph object.

Parameters

graph (networkx.Graph) – The networkx graph to be converted.

Returns

The converted graph as an instance of this class.

Return type

Graph

property islands: Set
Returns

The set of degree-0 nodes.

Return type

Set

issue_warnings() None[source]
Returns

None

Raises

UserWarning if the graph has any red flags (right now, only islands).

join(dataframe: <Mock name='mock.DataFrame' id='139929998521008'>, columns: ~typing.Optional[~typing.List[str]] = None, left_index: ~typing.Optional[str] = None, right_index: ~typing.Optional[str] = None) None[source]

Add data from a dataframe to the graph, matching nodes to rows when the node’s left_index attribute equals the row’s right_index value.

Parameters

dataframe (pandas.DataFrame) – DataFrame.

Columns

The columns whose data you wish to add to the graph. If not provided, all columns are added. Default is None.

Left_index

The node attribute used to match nodes to rows. If not provided, node IDs are used. Default is None.

Right_index

The DataFrame column name to use to match rows to nodes. If not provided, the DataFrame’s index is used. Default is None.

Returns

None

lookup(node: Any, field: Any) Any[source]

Lookup a node/field attribute.

Parameters
  • node (Any) – Node to look up.

  • field (Any) – Field to look up.

Returns

The value of the attribute field at node.

Return type

Any

property node_indices
to_json(json_file: str, *, include_geometries_as_geojson: bool = False) None[source]

Save a graph to a JSON file in the NetworkX json_graph format.

Parameters
  • json_file (str) – Path to target JSON file.

  • include_geometry_as_geojson (bool) – Whether to include any shapely geometry objects encountered in the graph’s node attributes as GeoJSON. The default (False) behavior is to remove all geometry objects because they are not serializable. Including the GeoJSON will result in a much larger JSON file.

Returns

None

warn_for_islands() None[source]
Returns

None

Raises

UserWarning if the graph has any islands (degree-0 nodes).

gerrychain.graph.graph.add_boundary_perimeters(graph: ~gerrychain.graph.graph.Graph, geometries: <Mock name='mock.Series' id='139929998520528'>) None[source]

Add shared perimeter between nodes and the total geometry boundary.

Parameters
  • graph (Graph) – NetworkX graph

  • geometries (pandas.Series) – geopandas.GeoSeries containing geometry information.

Returns

The updated graph.

Return type

Graph

gerrychain.graph.graph.check_dataframe(df: <Mock name='mock.DataFrame' id='139929998521008'>) None[source]
Returns

None

Raises

UserWarning if the dataframe has any NA values.

gerrychain.graph.graph.convert_geometries_to_geojson(data: Graph) None[source]

Convert geometry attributes in a NetworkX adjacency data object to GeoJSON, so that they can be serialized. Mutates the data object.

Does nothing if no geometry attributes are found.

Parameters

data (networkx.Graph) – an adjacency data object (returned by networkx.readwrite.json_graph.adjacency_data())

Returns

None

gerrychain.graph.graph.json_serialize(input_object: Any) Optional[int][source]

This function is used to handle one of the common issues that appears when trying to convert a pandas dataframe into a JSON serializable object. Specifically, it handles the issue of converting the pandas int64 to a python int so that JSON can serialize it. This is specifically used so that we can write graphs out to JSON files.

Parameters

input_object (Any (expected to be a pd.Int64Dtype)) – The object to be converted

Returns

The converted pandas object or None if input is not of type pd.Int64Dtype

Return type

Optional[int]

gerrychain.graph.graph.remove_geometries(data: Graph) None[source]

Remove geometry attributes from NetworkX adjacency data object, because they are not serializable. Mutates the data object.

Does nothing if no geometry attributes are found.

Parameters

data (networkx.Graph) – an adjacency data object (returned by networkx.readwrite.json_graph.adjacency_data())

Returns

None

Meta

Simple tooling to collect diversity stats on chain runs

class gerrychain.meta.diversity.DiversityStats(unique_plans: int, unique_districts: int, steps_taken: int)[source]

Bases: object

Lightweight stats object that reports the diversity of a given chain.

Variables
  • unique_plans – The number of unique plans seen so far.

  • unique_districts – The number of unique districts seen so far.

  • steps_taken – The number of steps taken so far.

Example usage:

DiversityStats(unique_plans=44162, unique_districts=82992, steps_taken=100000)
steps_taken: int
unique_districts: int
unique_plans: int
gerrychain.meta.diversity.collect_diversity_stats(chain: Iterable[Partition]) Iterable[Tuple[Partition, DiversityStats]][source]

Report the diversity of the chain being run, live, as a drop-in wrapper. Requires the cut_edges updater on each Partition object. Plans/districts are considered distinct if they are not isomorphic. That is, relabled plans and districts are considered non-unique and counted as duplicate.

Example usage:

for partition, stats in collect_diversity_stats(
    Replay(
        graph,
        "sample-run.chain"
        )
):
    print(stats)
    # normal chain stuff here
Parameters

chain (Iterable[Partition]) – A chain object to collect stats on.

Returns

An iterable of (partition, DiversityStat).

Return type

Iterable[Tuple[Partition, DiversityStats]]

Metrics

gerrychain.metrics.compactness.compute_polsby_popper(area: float, perimeter: float) float[source]

Computes the Polsby-Popper score for a single district.

Parameters
  • area (float) – The area of the district

  • perimeter (float) – The perimeter of the district

Returns

The Polsby-Popper score for the district

Return type

float

gerrychain.metrics.compactness.polsby_popper(partition) Dict[int, float][source]

Computes Polsby-Popper compactness scores for each district in the partition.

Parameters

partition (Partition) – The partition to compute scores for

Returns

A dictionary mapping each district ID to its Polsby-Popper score

Return type

Dict[int, float]

The partisan metrics in this file are later used in the module gerrychain.updaters.election.py. Thus, all of the election results objects here are implicilty typed as ElectionResults, but cannot be given an explicit type annotation due to problems with circular imports.

gerrychain.metrics.partisan.efficiency_gap(election_results) float[source]

Computes the efficiency gap for the given ElectionResults. A positive value indicates an advantage for the first party listed in the Election’s parties_to_columns dictionary.

Parameters

election_results (ElectionResults) – An ElectionResults object

Returns

The efficiency gap for the given ElectionResults

Return type

float

gerrychain.metrics.partisan.mean_median(election_results) float[source]

Computes the Mean-Median score for the given ElectionResults. A positive value indicates an advantage for the first party listed in the Election’s parties_to_columns dictionary.

Parameters

election_results (ElectionResults) – An ElectionResults object

Returns

The Mean-Median score for the given ElectionResults

Return type

float

gerrychain.metrics.partisan.mean_thirdian(election_results) float[source]

Computes the Mean-Median score for the given ElectionResults. A positive value indicates an advantage for the first party listed in the Election’s parties_to_columns dictionary.

The motivation for this score is that the minority party in many states struggles to win even a third of the seats.

Parameters

election_results (ElectionResults) – An ElectionResults object

Returns

The Mean-Thirdian score for the given ElectionResults

Return type

float

gerrychain.metrics.partisan.partisan_bias(election_results) float[source]

Computes the partisan bias for the given ElectionResults. The partisan bias is defined as the number of districts with above-mean vote share by the first party divided by the total number of districts, minus 1/2.

Parameters

election_results (ElectionResults) – An ElectionResults object

Returns

The partisan bias for the given ElectionResults

Return type

float

gerrychain.metrics.partisan.partisan_gini(election_results) float[source]

Computes the partisan Gini score for the given ElectionResults. The partisan Gini score is defined as the area between the seats-votes curve and its reflection about (.5, .5).

For more information on the computation, see Definition 1 in: https://arxiv.org/pdf/2008.06930.pdf

Parameters

election_results (ElectionResults) – An ElectionResults object

Returns

The partisan Gini score for the given ElectionResults

Return type

float

gerrychain.metrics.partisan.wasted_votes(party1_votes: int, party2_votes: int) Tuple[int, int][source]

Computes the wasted votes for each party in the given race. :param party1_votes: the number of votes party1 received in the race :type party1_votes: int :param party2_votes: the number of votes party2 received in the race :type party2_votes: int

Returns

a tuple of the wasted votes for each party

Return type

Tuple[int, int]

Partition

class gerrychain.partition.assignment.Assignment(parts: Dict, mapping: Optional[Dict] = None, validate: bool = True)[source]

Bases: Mapping

An assignment of nodes into parts.

The goal of Assignment is to provide an interface that mirrors a dictionary (what we have been using for assigning nodes to districts) while making it convenient/cheap to access the set of nodes in each part.

An Assignment has a parts property that is a dictionary of the form {part: <frozenset of nodes in part>}.

Parameters
  • parts (Dict) – Dictionary mapping partition assignments frozensets of nodes.

  • mapping (Optional[Dict], optional) – Dictionary mapping nodes to partition assignments. Default is None.

  • validate (bool, optional) – Whether to validate the assignment. Default is True.

Returns

None

Raises
  • ValueError – if the keys of parts are not unique

  • TypeError – if the values of parts are not frozensets

copy()[source]

Returns a copy of the assignment. Does not duplicate the frozensets of nodes, just the parts dictionary.

classmethod from_dict(assignment: Dict) Assignment[source]

Create an Assignment from a dictionary. This is probably the method you want to use to create a new assignment.

This also works for pandas.Series.

Parameters

assignment (Dict) – dictionary mapping nodes to partition assignments

Returns

A new instance of Assignment with the same assignments as the passed-in dictionary.

Return type

Assignment

items()[source]

Iterate over (node, part) tuples, where node is assigned to part.

keys()[source]
mapping
parts
to_dict() Dict[source]
Returns

The assignment as a {node: part} dictionary.

Return type

Dict

to_series() <Mock name='mock.Series' id='139929998520528'>[source]
Returns

The assignment as a pandas.Series.

Return type

pandas.Series

update_flows(flows)[source]

Update the assignment for some nodes using the given flows.

update_parts(new_parts: Dict) None[source]

Update some parts of the assignment. Does not check that every node is still assigned to a part.

Parameters

new_parts (Dict) – dictionary mapping (some) parts to their new sets or frozensets of nodes

Returns

None

values()[source]
gerrychain.partition.assignment.get_assignment(part_assignment: Union[str, Dict, Assignment], graph: Optional[Graph] = None) Assignment[source]

Either extracts an Assignment object from the input graph using the provided key or attempts to convert part_assignment into an Assignment object.

Parameters
  • part_assignment (str) – A node attribute key, dictionary, or Assignment object corresponding to the desired assignment.

  • graph (Optional[Graph], optional) – The graph from which to extract the assignment. Default is None.

Returns

An Assignment object containing the assignment corresponding to the part_assignment input

Return type

Assignment

Raises
  • TypeError – If the part_assignment is a string and the graph is not provided.

  • TypeError – If the part_assignment is not a string or dictionary.

gerrychain.partition.assignment.level_sets(mapping: ~typing.Dict, container: ~typing.Type[~typing.Set] = <class 'set'>) DefaultDict[source]

Inverts a dictionary. {key: value} becomes {value: <container of keys that map to value>}.

Parameters
  • mapping (Dict) – A dictionary to invert. Keys and values can be of any type.

  • container (Type[Set], optional) – A container type used to collect keys that map to the same value. By default, the container type is set.

Returns

A dictionary where each key is a value from the original dictionary, and the corresponding value is a container (by default, a set) of keys from the original dictionary that mapped to this value.

Return type

DefaultDict

Example usage:

.. code_block:: python
>>> level_sets({'a': 1, 'b': 1, 'c': 2})
defaultdict(<class 'set'>, {1: {'a', 'b'}, 2: {'c'}})
class gerrychain.partition.geographic.GeographicPartition(graph=None, assignment=None, updaters=None, parent=None, flips=None, use_default_updaters=True)[source]

Bases: Partition

A Partition with areas, perimeters, and boundary information included. These additional data allow you to compute compactness scores like Polsby-Popper.

Parameters
  • graph – Underlying graph.

  • assignment – Dictionary assigning nodes to districts.

  • updaters – Dictionary of functions to track data about the partition. The keys are stored as attributes on the partition class, which the functions compute.

  • use_default_updaters – If False, do not include default updaters.

assignment
default_updaters = {'area': <gerrychain.updaters.tally.Tally object>, 'boundary_nodes': <function boundary_nodes>, 'cut_edges': <function cut_edges>, 'cut_edges_by_part': <function cut_edges_by_part>, 'exterior_boundaries': <function exterior_boundaries>, 'interior_boundaries': <function interior_boundaries>, 'perimeter': <function perimeter>}
edge_flows
flips
flows
graph
parent
subgraphs
updaters
class gerrychain.partition.partition.Partition(graph=None, assignment=None, updaters=None, parent=None, flips=None, use_default_updaters=True)[source]

Bases: object

Partition represents a partition of the nodes of the graph. It will perform the first layer of computations at each step in the Markov chain - basic aggregations and calculations that we want to optimize.

Variables
  • graph – The underlying graph.

  • assignment – Maps node IDs to district IDs.

  • parts – Maps district IDs to the set of nodes in that district.

  • subgraphs – Maps district IDs to the induced subgraph of that district.

Parameters
  • graph (Graph) – Underlying graph.

  • assignment (Assignment) – Dictionary assigning nodes to districts.

  • updaters – Dictionary of functions to track data about the partition. The keys are stored as attributes on the partition class, which the functions compute.

  • use_default_updaters – If False, do not include default updaters.

assignment
crosses_parts(edge: Tuple) bool[source]
Parameters

edge (Tuple) – tuple of node IDs

Returns

True if the edge crosses from one part of the partition to another

Return type

bool

default_updaters = {'cut_edges': <function cut_edges>}
edge_flows
flip(flips: Dict) Partition[source]

Returns the new partition obtained by performing the given flips on this partition.

Parameters

flips – dictionary assigning nodes of the graph to their new districts

Returns

the new Partition

Return type

Partition

flips
flows
classmethod from_districtr_file(graph: Graph, districtr_file: str, updaters: Optional[Dict[str, Callable]] = None) Partition[source]

Create a Partition from a districting plan created with Districtr, a free and open-source web app created by MGGG for drawing districts.

The provided graph should be created from the same shapefile as the Districtr module used to draw the districting plan. These shapefiles may be found in a repository in the mggg-states GitHub organization, or by request from MGGG.

Parameters
  • graph (Graph) – The graph to create the Partition from

  • districtr_file (str) – the path to the .json file exported from Districtr

  • updaters (Optional[Dict[str, Callable]], optional) – dictionary of updaters

Returns

The partition created from the Districtr file

Return type

Partition

graph
keys()[source]
parent
property parts
plot(geometries=None, **kwargs)[source]

Plot the partition, using the provided geometries.

Parameters
  • geometries (geopandas.GeoDataFrame or geopandas.GeoSeries) – A geopandas.GeoDataFrame or geopandas.GeoSeries holding the geometries to use for plotting. Its Index should match the node labels of the partition’s underlying Graph.

  • **kwargs – Additional arguments to pass to geopandas.GeoDataFrame.plot() to adjust the plot.

Returns

The matplotlib axes object. Which plots the Partition.

Return type

matplotlib.axes.Axes

subgraphs
updaters
class gerrychain.partition.subgraphs.SubgraphView(graph: Graph, parts: List[List[Any]])[source]

Bases: object

A view for accessing subgraphs of Graph objects.

This class makes use of a subgraph cache to avoid recomputing subgraphs which can speed up computations when working with district assignments within a partition class.

Variables
  • graph – The parent graph from which subgraphs are derived.

  • parts – A list-of-lists dictionary (so a dict with key values indicated by the list index) mapping keys to subsets of nodes in the graph.

  • subgraphs_cache – Cache to store subgraph views for quick access.

Parameters
  • graph (Graph) – The parent graph from which subgraphs are derived.

  • parts (List[List[Any]]) – A list of lists of nodes corresponding the different parts of the partition of the graph.

Returns

None

graph
items() Tuple[int, Graph][source]
parts
subgraphs_cache

Proposals

gerrychain.proposals.proposals.flip(partition: Partition) Partition

Proposes a random boundary flip from the partition.

Parameters

partition (Partition) – The current partition to propose a flip from.

Returns

A possible next ~gerrychain.Partition

Return type

Partition

gerrychain.proposals.proposals.propose_any_node_flip(partition: Partition) Partition[source]

Flip a random node (not necessarily on the boundary) to a random part

Parameters

partition (Partition) – The current partition to propose a flip from.

Returns

A possible next ~gerrychain.Partition

Return type

Partition

gerrychain.proposals.proposals.propose_chunk_flip(partition: Partition) Partition[source]

Chooses a random boundary node and proposes to flip it and all of its neighbors

Parameters

partition (Partition) – The current partition to propose a flip from.

Returns

A possible next ~gerrychain.Partition

Return type

Partition

gerrychain.proposals.proposals.propose_flip_every_district(partition: Partition) Partition[source]

Proposes a random boundary flip for each district in the partition.

Parameters

partition (Partition) – The current partition to propose the flips from.

Returns

A possible next ~gerrychain.Partition

Return type

Partition

gerrychain.proposals.proposals.propose_random_flip(partition: Partition) Partition[source]

Proposes a random boundary flip from the partition.

Parameters

partition (Partition) – The current partition to propose a flip from.

Returns

A possible next ~gerrychain.Partition

Return type

Partition

gerrychain.proposals.proposals.slow_reversible_propose(partition: Partition) Partition[source]

Proposes a random boundary flip from the partition in a reversible fasion by selecting uniformly from the (node, flip) pairs.

Temporary version until we make an updater for this set.

Parameters

partition (Partition) – The current partition to propose a flip from.

Returns

A possible next ~gerrychain.Partition

Return type

Partition

gerrychain.proposals.proposals.slow_reversible_propose_bi(partition: Partition) Partition[source]

Proposes a random boundary flip from the partition in a reversible fashion for bipartitions by selecting a boundary node at random and uniformly picking one of its neighboring parts. For k-partitions this is not uniform since there might be multiple parts next to a single node.

Temporary version until we make an updater for this set.

Parameters

partition (Partition) – The current partition to propose a flip from.

Returns

A possible next ~gerrychain.Partition

Return type

Partition

gerrychain.proposals.spectral_proposals.spectral_cut(graph: Graph, part_labels: Dict, weight_type: str, lap_type: str) Dict[source]

Spectral cut function.

Uses the signs of the elements in the Fiedler vector of a graph to partition into two components.

Parameters
  • graph (Graph) – The graph to be partitioned.

  • part_labels (Dict) – The current partition of the graph.

  • weight_type (str) – The type of weight to be used in the Laplacian.

  • lap_type (str) – The type of Laplacian to be used.

Returns

A dictionary assigning nodes of the graph to their new districts.

Return type

Dict

gerrychain.proposals.spectral_proposals.spectral_recom(partition: Partition, weight_type: Optional[str] = None, lap_type: str = 'normalized') Partition[source]

Spectral ReCom proposal.

Uses spectral clustering to bipartition a subgraph of the original graph formed by merging the nodes corresponding to two adjacent districts.

Example usage:

from functools import partial
from gerrychain import MarkovChain
from gerrychain.proposals import recom

# ...define constraints, accept, partition, total_steps here...


proposal = partial(
    spectral_recom, weight_type=None, lap_type="normalized"
)

chain = MarkovChain(proposal, constraints, accept, partition, total_steps)
Parameters
  • partition (Partition) – The initial partition.

  • weight_type (Optional[str], optional) – The type of weight to be used in the Laplacian. Default is None.

  • lap_type (str, optional) – The type of Laplacian to be used. Default is “normalized”.

Returns

The new partition resulting from the spectral ReCom algorithm.

Return type

Partition

exception gerrychain.proposals.tree_proposals.MetagraphError[source]

Bases: Exception

Raised when the partition we are trying to split is a low degree node in the metagraph.

class gerrychain.proposals.tree_proposals.ReCom(pop_col: str, ideal_pop: ~typing.Union[int, float], epsilon: float, method: ~typing.Callable = <function bipartition_tree_random>)[source]

Bases: object

ReCom (short for ReCombination) is a class that represents a ReCom proposal for redistricting. It is used to create new partitions by recombining existing districts while maintaining population balance.

Parameters
  • pop_col (str) – The name of the column in the partition that contains the population data.

  • ideal_pop (Union[int,float]) – The ideal population for each district.

  • epsilon (float) – The epsilon value for population deviation as a percentage of the target population.

  • method (function, optional) – The method used for bipartitioning the tree. Defaults to bipartition_tree_random.

exception gerrychain.proposals.tree_proposals.ReversibilityError(msg)[source]

Bases: Exception

Raised when the cut edge upper bound is violated.

exception gerrychain.proposals.tree_proposals.ValueWarning[source]

Bases: UserWarning

Raised whe a particular value is technically valid, but may cause issues with the algorithm.

gerrychain.proposals.tree_proposals.recom(partition: ~gerrychain.partition.partition.Partition, pop_col: str, pop_target: ~typing.Union[int, float], epsilon: float, node_repeats: int = 1, weight_dict: ~typing.Optional[~typing.Dict] = None, method: ~typing.Callable = <function bipartition_tree>) Partition[source]

ReCom (short for ReCombination) is a Markov Chain Monte Carlo (MCMC) algorithm used for redistricting. At each step of the algorithm, a pair of adjacent districts is selected at random and merged into a single district. The region is then split into two new districts by generating a spanning tree using the Kruskal/Karger algorithm and cutting an edge at random. The edge is checked to ensure that it separates the region into two new districts that are population balanced, and, if not, a new edge is selected at random and the process is repeated.

Example usage:

from functools import partial
from gerrychain import MarkovChain
from gerrychain.proposals import recom

# ...define constraints, accept, partition, total_steps here...

# Ideal population:
pop_target = sum(partition["population"].values()) / len(partition)

proposal = partial(
    recom, pop_col="POP10", pop_target=pop_target, epsilon=.05, node_repeats=10
)

chain = MarkovChain(proposal, constraints, accept, partition, total_steps)
Parameters
  • partition (Partition) – The initial partition.

  • pop_col (str) – The name of the population column.

  • pop_target (Union[int,float]) – The target population for each district.

  • epsilon (float) – The epsilon value for population deviation as a percentage of the target population.

  • node_repeats (int, optional) – The number of times to repeat the bipartitioning step. Default is 1.

  • weight_dict (Optional[Dict], optional) – The weight dictionary for the graph used for region-aware partitioning of the grid. Default is None.

  • method (Callable, optional) – The method used for bipartitioning the tree. Default is bipartition_tree().

Returns

The new partition resulting from the ReCom algorithm.

Return type

Partition

Raises

ValueWarning – Raised when the sum of the weights in the weight dictionary is greater than 1.

gerrychain.proposals.tree_proposals.reversible_recom(partition: ~gerrychain.partition.partition.Partition, pop_col: str, pop_target: ~typing.Union[int, float], epsilon: float, balance_edge_fn: ~typing.Callable = <function find_balanced_edge_cuts_memoization>, M: int = 1, repeat_until_valid: bool = False, choice: ~typing.Callable = <bound method Random.choice of <random.Random object>>) Partition[source]

Reversible ReCom algorithm for redistricting.

This function performs the reversible ReCom algorithm, which is a Markov Chain Monte Carlo (MCMC) algorithm used for redistricting. For more information, see the paper “Spanning Tree Methods for Sampling Graph Partitions” by Cannon, et al. (2022) at https://arxiv.org/abs/2210.01401

Parameters
  • partition (Partition) – The initial partition.

  • pop_col (str) – The name of the population column.

  • pop_target (Union[int,float]) – The target population for each district.

  • epsilon (float) – The epsilon value for population deviation as a percentage of the target population.

  • balance_edge_fn (Callable, optional) – The balance edge function. Default is find_balanced_edge_cuts_memoization.

  • M (int, optional) – The maximum number of balance edges. Default is 1.

  • repeat_until_valid (bool, optional) – Flag indicating whether to repeat until a valid partition is found. Default is False.

  • choice (Callable, optional) – The choice function for selecting a random element. Default is random.choice.

Returns

The new partition resulting from the reversible ReCom algorithm.

Return type

Partition

Updaters

gerrychain.updaters.compactness.boundary_nodes(partition, alias: str = 'boundary_nodes') Set[source]
Parameters
  • partition (Partition) – A partition of a Graph

  • alias (str, optional) – The name of the attribute that the boundary nodes are stored under. Default is ‘boundary_nodes’.

Returns

The set of nodes in the partition that are on the boundary.

Return type

Set

gerrychain.updaters.compactness.exterior_boundaries(partition, previous: Optional[Set], inflow: Set, outflow: Set) Dict[source]

Updater function that responds to the flow of nodes between different partitions.

Parameters
  • partition (Partition) – A partition of a Graph

  • previous (Set) – The previous set of exterior boundary nodes for a fixed part of the given partition.

  • inflow (Set) – The set of nodes that have flowed into the given part of the partition.

  • outflow (Set) – The set of nodes that have flowed out of the given part of the partition.

Returns

A dict mapping each part of the partition to the new exterior boundary of that part.

Return type

Dict

gerrychain.updaters.compactness.exterior_boundaries_as_a_set(partition, previous: Optional[Set], inflow: Set, outflow: Set) Set[source]

Updater function that responds to the flow of nodes between different partitions.

Parameters
  • partition (Partition) – A partition of a Graph

  • previous (Set) – The previous set of exterior boundary nodes for a fixed part of the given partition.

  • inflow (Set) – The set of nodes that have flowed into the given part of the partition.

  • outflow (Set) – The set of nodes that have flowed out of the given part of the partition.

Returns

The new set of exterior boundary nodes for the given part of the partition.

Return type

Set

gerrychain.updaters.compactness.flips(partition) Dict[source]
Parameters

partition (Partition) – A partition of a Graph

Returns

The flips that were made to get from the parent partition to the given partition.

Return type

Dict

gerrychain.updaters.compactness.initialize_exterior_boundaries(partition) Dict[int, float][source]
Parameters

partition (Partition) – A partition of a Graph

Returns

A dictionary mapping each part of a partition to the total perimeter of the boundary nodes in that part.

Return type

Dict[int, float]

gerrychain.updaters.compactness.initialize_exterior_boundaries_as_a_set(partition) Dict[int, Set][source]
Parameters

partition (Partition) – A partition of a Graph

Returns

A dictionary mapping each part of a partition to the set of nodes in that part that are on the boundary.

Return type

Dict[int, Set]

gerrychain.updaters.compactness.initialize_interior_boundaries(partition)[source]
Parameters

partition (Partition) – A partition of a Graph

Returns

A dictionary mapping each part of a partition to the total perimeter the given part shares with other parts.

Return type

Dict[int, float]

gerrychain.updaters.compactness.interior_boundaries(partition, previous: Set, new_edges: Set, old_edges: Set) Dict[source]

Updater function that responds to the flow of nodes between different partitions.

Parameters
  • partition (Partition) – A partition of a Graph

  • previous (Set) – The previous set of exterior boundary nodes for a fixed part of the given partition.

  • new_edges (Set) – The set of edges that have flowed into the given part of the partition.

  • old_edges (Set) – The set of edges that have flowed out of the given part of the partition.

Returns

A dict mapping each part of the partition to the new interior boundary of that part.

Return type

Dict

gerrychain.updaters.compactness.perimeter(partition) Dict[int, float][source]
Parameters

partition (Partition) – A partition of a Graph

Returns

A dictionary mapping each part of a partition to its perimeter.

Return type

Dict[int, float]

gerrychain.updaters.compactness.perimeter_of_part(partition, part: int) float[source]

Totals up the perimeter of the part in the partition.

Warning

Requires that ‘boundary_perim’ be a node attribute, ‘shared_perim’ be an edge attribute, ‘cut_edges’ be an updater, and ‘exterior_boundaries’ be an updater.

Parameters
  • partition (Partition) – A partition of a Graph

  • part (int) – The id of the part of the partition whose perimeter we want to compute.

Returns

The perimeter of the desired part.

Return type

float

class gerrychain.updaters.county_splits.CountyInfo(split, nodes, contains)

Bases: tuple

A named tuple to store county split information.

Parameters
  • split (int) – The county split status. Makes use of CountySplit enum to compute.

  • nodes (List) – The nodes that are contained in the county.

  • contains (Set) – The assignment IDs that are contained in the county.

contains

Alias for field number 2

nodes

Alias for field number 1

split

Alias for field number 0

class gerrychain.updaters.county_splits.CountySplit(value)[source]

Bases: Enum

Enum to track county splits in a partition.

Variables
  • NOT_SPLIT – The county is not split.

  • NEW_SPLIT – The county is split in the current partition.

  • OLD_SPLIT – The county is split in the parent partition.

NEW_SPLIT = 1
NOT_SPLIT = 0
OLD_SPLIT = 2
gerrychain.updaters.county_splits.compute_county_splits(partition, county_field: str, partition_field: str) Dict[str, CountyInfo][source]

Track nodes in counties and information about their splitting.

Parameters
  • partition (Partition) – The partition object to compute county splits for.

  • county_field (str) – Name of county ID field on the graph.

  • partition_field (str) – Name of the attribute in the graph that stores the partition information. The county split information will be computed with respect to this division of the graph.

Returns

A dict containing the information on how counties changed between the parent and child partitions. If there is no parent partition, then only the OLD_SPLIT and NOT_SPLIT values will be used.

Return type

Dict[str, CountyInfo]

gerrychain.updaters.county_splits.county_splits(partition_name: str, county_field_name: str) Callable[source]

Update that allows for the tracking of county splits.

Parameters
  • partition_name (str) – Name that the Partition instance will store.

  • county_field_name (str) – Name of county ID field on the graph.

Returns

The tracked data is a dictionary keyed on the county ID. The stored values are tuples of the form (split, nodes, seen). split is a CountySplit enum, nodes is a list of node IDs, and seen is a list of assignment IDs that are contained in the county.

Return type

Callable

gerrychain.updaters.county_splits.tally_region_splits(reg_attr_lst)[source]

A naive updater for tallying the number of times a region attribute is split. for each region attribute in reg_attr_lst.

Parameters

reg_attr_lst (List[str]) – A list of region names to tally splits for.

Returns

A function that takes a partition and returns a dictionary which maps the region name to the number of times that it is split in a a particular partition.

Return type

Callable

gerrychain.updaters.county_splits.total_reg_splits(partition, reg_attr)[source]

Returns the total number of times that reg_attr is split in the partition.

gerrychain.updaters.cut_edges.cut_edges(partition)[source]
Parameters

partition (Partition) – A partition of a Graph

Returns

The set of edges that are cut by the given partition.

Return type

Set[Tuple]

gerrychain.updaters.cut_edges.cut_edges_by_part(partition, previous: Set[Tuple], new_edges: Set[Tuple], old_edges: Set[Tuple]) Set[Tuple][source]

Updater function that responds to the flow of edges between different partitions.

Parameters
  • partition (Partition) – A partition of a Graph

  • previous (Set[Tuple]) – The previous set of edges for a fixed part of the given partition.

  • new_edges (Set[Tuple]) – The set of edges that have flowed into the given part of the partition.

  • old_edges (Set[Tuple]) – The set of cut edges in the previous partition.

Returns

The new set of cut edges for the newly generated partition.

Return type

Set

gerrychain.updaters.cut_edges.initialize_cut_edges(partition)[source]
Parameters

partition (Partition) – A partition of a Graph

Returns

A dictionary mapping each part of a partition to the set of edges in that part.

Return type

Dict

gerrychain.updaters.cut_edges.new_cuts(partition) Set[Tuple][source]
Parameters

partition (Partition) – A partition of a Graph

Returns

The set of edges that were not cut, but now are.

Return type

Set[Tuple]

gerrychain.updaters.cut_edges.obsolete_cuts(partition) Set[Tuple][source]
Parameters

partition (Partition) – A partition of a Graph

Returns

The set of edges that were cut, but now are not.

Return type

Set[Tuple]

gerrychain.updaters.cut_edges.put_edges_into_parts(edges: List, assignment: Dict) Dict[source]
Parameters
  • edges (List) – A list of edges in a graph which are to be separated into their respective parts within the partition according to the given assignment.

  • assignment (Dict) – A dictionary mapping nodes to their respective parts within the partition.

Returns

A dictionary mapping each part of a partition to the set of edges in that part.

Return type

Dict

class gerrychain.updaters.election.Election(name: str, parties_to_columns: Union[Dict, List], alias: Optional[str] = None)[source]

Bases: object

Represents the data of one election, with races conducted in each part of the partition.

As we vary the districting plan, we can use the same node-level vote totals to tabulate hypothetical elections. To do this manually with tallies, we would have to maintain tallies for each party, as well as the total number of votes, and then compute the electoral results and percentages from scratch every time. To make this simpler, this class provides an ElectionUpdater to manage these tallies. The updater returns an ElectionResults class giving a convenient view of the election results, with methods like wins() or percent() for common queries the user might make on election results.

Example usage:

# Assuming your nodes have attributes "2008_D", "2008_R"
# with (for example) 2008 senate election vote totals
election = Election(
    "2008 Senate",
    {"Democratic": "2008_D", "Republican": "2008_R"},
    alias="2008_Sen"
)

# Assuming you already have a graph and assignment:
partition = Partition(
    graph,
    assignment,
    updaters={"2008_Sen": election}
)

# The updater returns an ElectionResults instance, which
# we can use (for example) to see how many seats a given
# party would win in this partition using this election's
# vote distribution:
partition["2008_Sen"].wins("Republican")
Variables
  • name – The name of the election. (e.g. “2008 Presidential”)

  • parties – A list of the names of the parties in the election.

  • columns – A list of the columns in the graph’s node data that hold the vote totals for each party.

  • parties_to_columns – A dictionary mapping party names to the columns in the graph’s node data that hold the vote totals for that party.

  • tallies – A dictionary mapping party names to DataTally objects that manage the vote totals for that party.

  • updater – An ElectionUpdater object that manages the tallies and returns an ElectionResults object.

  • alias – The name that the election is registered under in the partition’s dictionary of updaters.

Parameters
  • name (str) – The name of the election. (e.g. “2008 Presidential”)

  • parties_to_columns (Union[Dict, List]) – A dictionary matching party names to their data columns, either as actual columns (list-like, indexed by nodes) or as string keys for the node attributes that hold the party’s vote totals. Or, a list of strings which will serve as both the party names and the node attribute keys.

  • alias (Optional[str], optional) – Alias that the election is registered under in the Partition’s dictionary of updaters.

class gerrychain.updaters.election.ElectionResults(election: Election, counts: Dict[str, Dict[int, float]], regions: List[int])[source]

Bases: object

Represents the results of an election. Provides helpful methods to answer common questions you might have about an election (Who won? How many seats?, etc.).

Variables
  • election – The Election object that these results are associated with.

  • totals_for_party – A dictionary mapping party names to the total number of votes that party received in each part of the partition.

  • regions – A list of regions that we would like the results for.

  • totals – A dictionary mapping each part of the partition to the total number of votes cast in that part.

  • percents_for_party – A dictionary mapping party names to the percentage of votes that party received in each part of the partition.

Note

The variable “regions” is generally called “parts” in other sections of the codebase, but we have changed it here to avoid confusion with the parameter “party” that often appears within the class.

Parameters
  • election (Election) – The Election object that these results are associated with.

  • regions (List[int]) – A list of regions that we would like to consider (e.g. congressional districts).

Counts

A dictionary mapping party names to the total number of votes that party received in each part of the partition.

Returns

None

count(party: str, region: Optional[str] = None) int[source]
Parameters
  • party (str) – Party ID.

  • region (Optional[int], optional) – ID of the part of the partition whose votes we want to tally.

Returns

The total number of votes that party received in a given region (part of the partition). If region is omitted, returns the overall vote total of party.

Return type

int

counts(party: str) Tuple[source]
Parameters

party (str) – Party ID

Returns

tuple of the total votes cast for party in each part of the partition

Return type

Tuple

efficiency_gap() float[source]

Computes the efficiency gap for this ElectionResults object.

See: efficiency_gap()

Returns

The efficiency gap for this election.

Return type

float

mean_median() float[source]

Computes the mean-median score for this ElectionResults object.

See: mean_median()

Returns

The mean-median score for this election.

Return type

float

mean_thirdian() float[source]

Computes the mean-thirdian score for this ElectionResults object.

See: mean_thirdian()

Returns

The mean-thirdian score for this election.

Return type

float

partisan_bias() float[source]

Computes the partisan bias for this ElectionResults object.

See: partisan_bias()

Returns

The partisan bias for this election.

Return type

float

partisan_gini() float[source]

Computes the Gini score for this ElectionResults object.

See: partisan_gini()

Returns

The partisan Gini score for this election.

Return type

float

percent(party: str, region: Optional[int] = None) float[source]
Parameters
  • party (str) – Party ID.

  • region (Optional[int], optional) – ID of the part of the partition whose votes we want to tally.

Returns

The percentage of the vote that party received in a given region (part of the partition). If region is omitted, returns the overall vote share of party.

Return type

float

percents(party: str) Tuple[source]
Parameters

party (str) – Party ID

Returns

The tuple of the percentage of votes that party received in each part of the partition

Return type

Tuple

seats(party: str) int[source]
Parameters

party (str) – Party name

Returns

The number of seats that party won.

Return type

int

total_votes() int[source]
Returns

The total number of votes cast in the election.

Return type

int

votes(party: str) Tuple[source]

An alias for counts().

Parameters

party (str) – Party ID

Returns

tuple of the total votes cast for party in each part of the partition

Return type

Tuple

wins(party: str) int[source]

An alias for seats().

Parameters

party (str) – Party name

Returns

The number of seats that party won.

Return type

int

won(party: str, region: str) bool[source]
Parameters
  • party (str) – Party ID

  • region (str) – ID of the part of the partition whose votes we want to tally.

Returns

Answer to “Did party win the region in part region?”

Return type

bool

class gerrychain.updaters.election.ElectionUpdater(election: Election)[source]

Bases: object

The updater for computing the election results in each part of the partition after each step in the Markov chain. The actual results are returned to the user as an ElectionResults instance.

Variables

election – The Election object that this updater is associated with.

get_previous_values(partition) Dict[str, Dict[int, float]][source]
Parameters

partition (Partition) – The partition whose parent we want to obtain the previous vote totals from.

Returns

A dictionary mapping party names to the vote totals that party received in each part of the parent of the current partition.

Return type

Dict[str, Dict[int, float]]

gerrychain.updaters.election.format_part_results(percents_for_party: Dict[str, Dict[int, float]], part: int) str[source]
Parameters
  • percents_for_party (Dict[str, Dict[int, float]]) – A dictionary mapping party names to a dict containing the percentage of votes that party received in each part of the partition.

  • part (int) – The part of the partition whose results we want to format.

Returns

A formatted string containing the results for the given part of the partition.

Return type

str

gerrychain.updaters.election.get_percents(counts: Dict, totals: Dict) Dict[source]
Parameters
  • counts (Dict) – A dictionary mapping each part in a partition to the count of the number of votes that a party received in that part.

  • totals (Dict) – A dictionary mapping each part in a partition to the total number of votes cast in that part.

Returns

A dictionary mapping each part in a partition to the percentage

Return type

Dict

gerrychain.updaters.flows.compute_edge_flows(partition) Dict[source]
Parameters

partition (Partition) – A partition of a Graph

Returns

A flow dictionary containing the flow from the parent of this partition to this partition. This dictionary is of the form {part: {‘in’: <set of edges that flowed in>, ‘out’: <set of edges that flowed out>}}.

Return type

Dict

gerrychain.updaters.flows.create_flow()[source]
gerrychain.updaters.flows.flows_from_changes(old_partition, new_partition) Dict[source]
Parameters
  • old_partition (Partition) – A partition of a Graph representing the previous step.

  • new_partition (Partition) – A partition of a Graph representing the current step.

Returns

A dictionary mapping each node that changed assignment between the previous and current partitions to a dictionary of the form {‘in’: <set of nodes that flowed in>, ‘out’: <set of nodes that flowed out>}.

Return type

Dict

gerrychain.updaters.flows.neighbor_flips(partition) Set[Tuple][source]
Parameters

partition (Partition) – A partition of a Graph

Returns

The set of edges that were flipped in the given partition.

Return type

Set[Tuple]

gerrychain.updaters.flows.on_edge_flow(initializer: Callable, alias: str) Callable[source]

Use this decorator to create an updater that responds to flows of cut edges between parts of the partition.

Decorate a function that takes: - The partition - The previous value of the updater for a fixed part P_i - The new cut edges that are just joining P_i at this step - The old cut edges that are just leaving P_i at this step and returns: - The new value of the updater for the fixed part P_i.

This will create an updater whose values are dictionaries of the form {part: <value of the given function on the part>}.

The initializer, by contrast, should take the entire partition and return the entire {part: <value>} dictionary.

Example:

@on_edge_flow(initializer, alias='my_updater')
def my_updater(partition, previous, new_edges, old_edges):
    # return new value of the part
Parameters
  • initializer (Callable) – A function that takes the partition and returns a dictionary of the form {part: <value>}.

  • alias (str) – The name of the updater to be created.

Returns

A decorator that takes a function as input and returns a wrapped function.

Return type

Callable

gerrychain.updaters.flows.on_flow(initializer: Callable, alias: str) Callable[source]

Use this decorator to create an updater that responds to flows of nodes between parts of the partition.

Decorate a function that takes: - The partition - The previous value of the updater on a fixed part P_i - The new nodes that are just joining P_i at this step - The old nodes that are just leaving P_i at this step and returns: - The new value of the updater for the fixed part P_i.

This will create an updater whose values are dictionaries of the form {part: <value of the given function on the part>}.

The initializer, by contrast, should take the entire partition and return the entire {part: <value>} dictionary.

Example:

@on_flow(initializer, alias='my_updater')
def my_updater(partition, previous, new_nodes, old_nodes):
    # return new value for the part
Parameters
  • initializer (Callable) – A function that takes the partition and returns a dictionary of the form {part: <value>}.

  • alias (str) – The name of the updater to be created.

Returns

A decorator that takes a function as input and returns a wrapped function.

Return type

Callable

class gerrychain.updaters.locality_split_scores.LocalitySplits(name: str, col_id: str, pop_col: str, scores_to_compute: List[str] = ['num_parts'], pent_alpha: float = 0.05)[source]

Bases: object

Computes various splitting measures for a partition

Can be used to compute how a districting plan splits against any static attribute. The prototypical example is to consider how a districting plan subdivides counties or municipalities, but other units, such as city neighborhoods, state legislative districts, or Census tracts could be treated as ‘localities’

Example usage:

# Assuming your nodes have attributes "countyID"
# with (for example) the name of the county that
# node lies in and a population attribute "pop":
county_splits = LocalitySplits(
    "countysplits",
    "countyID",
    "pop",
    ["num_parts", "symmetric_entropy","power_entropy"],
    pent_alpha = 0.8
)
# Assuming you already have a graph and assignment:
partition = Partition(
    graph,
    assignment,
    updaters={"county_splits" : county_splits}
)
# The updater returns an dictionary instance, which
# at each step of the chain has the name of the score
# and its value at that step
Variables
  • name – The name of the updater (e.g. “countysplits”)

  • col_id – The name of the column containing the locality attribute (i.e. county ids, municipality names, etc.)

  • pop_col – The name of the column containing population counts.

  • scores_to_compute – A list/tuple/set of strings naming the score functions to compute at each step. This will generally be some subcollection of `['num_parts', 'num_pieces', 'naked_boundary', 'shannon_entropy', 'power_entropy', 'symmetric_entropy', 'num_split_localities']`

  • pent_alpha – A number between 0 and 1 which is passed as the exponent to power_entropy()

  • localities – A list containing the unique locality identifiers (e.g. county names, municipality names, etc.) for the partition. This list is populated using the locality data stored on each of the nodes in the graph.

  • localitydict – A dictionary mapping node IDs to locality IDs. This is used to quickly look up the locality of a given node.

  • locality_splits – A dictionary mapping district IDs to a counter of localities in that district. That is to say, this tells us how many nodes in each district are of the given locality type.

  • locality_splits_inv – The inverted dictionary of locality_splits

  • allowed_pieces – A dictionary that maps each locality to the minimum number of districts that locality must touch. This is computed using the ideal district population. NOT CURRENTLY USED.

  • scores – A dictionary initialized with the key values from the initializer’s scores_to_compute parameter. The initial values are set to none and are updated in each call to store the compted score value for each metric of interest.

Parameters
  • name (str) – The name of the updater (e.g. “countysplits”)

  • col_id (str) – The name of the column containing the locality attribute (i.e. county ids, municipality names, etc.)

  • pop_col (str) – The name of the column containing population counts.

  • scores_to_compute (List[str], optional) – A list/tuple/set of strings naming the score functions to compute at each step. This should be some subcollection of `['num_parts', 'num_pieces', 'naked_boundary', 'shannon_entropy', 'power_entropy', 'symmetric_entropy', 'num_split_localities']`. Default is [“num_parts”].

  • pent_alpha (float, optional) – A number between 0 and 1 which is passed as the exponent to power_entropy(). Default is 0.05.

naked_boundary(partition) int[source]
Computes the number of cut edges inside localities (i.e. the

number of cut edges with both endpoints in the same locality).

Parameters

partition (Partition) – The partition to be scored.

Returns

The number of cut edges within a locality.

Return type

int

num_parts(partition) int[source]

Calculates the number of unique locality-district pairs.

Parameters

partition (Partition) – The partition to be scored.

Returns

The number of parts, i.e. the number of unique locality-district pairs.

Return type

int

num_pieces(partition) int[source]

Calculates the number of pieces.

Parameters

partition (Partition) – The partition to be scored.

Returns

Number of pieces, where each piece is formed by cutting the graph by both locality and district boundaries.

Return type

int

num_split_localities(partition) int[source]

Calculates the number of localities touching 2 or more districts.

Parameters

partition (Partition) – The partition to be scored.

Returns

The number of split localities, i.e. the number of localities touching 2 or more districts.

Return type

int

power_entropy(partition) float[source]

Computes the power entropy score of a district plan.

Parameters

partition (Partition) – The partition to be scored.

Returns

Power entropy score.

Return type

float

shannon_entropy(partition) float[source]

Computes the shannon entropy score of a district plan.

Parameters

partition (Partition) – The partition to be scored.

Returns

Shannon entropy score.

Return type

float

symmetric_entropy(partition) float[source]

Calculates the symmetric entropy score.

Warning:

This function is previously marked incomplete.
Parameters

partition (Partition) – The partition to be scored.

Returns

The symmetric square root entropy score.

Return type

float

Updaters that compute spanning tree statistics.

gerrychain.updaters.spanning_trees.num_spanning_trees(partition) Dict[int, int][source]
Returns

The number of spanning trees in each part (district) of a partition.

Return type

Dict[int, int]

class gerrychain.updaters.tally.DataTally(data: ~typing.Union[~typing.Dict, <Mock name='mock.Series' id='139929998520528'>, str], alias: str)[source]

Bases: object

An updater for tallying numerical data that is not necessarily stored as node attributes

Variables
  • data – A Dict or Series indexed by the graph’s nodes, or the string key for a node attribute containing the Tally’s data.

  • alias – The name of the tally in the Partition’s updaters dictionary

Parameters
  • data (Union[Dict, pandas.Series, str]) – A Dict or Series indexed by the graph’s nodes, or the string key for a node attribute containing the Tally’s data.

  • alias (str) – The name of the tally in the Partition’s updaters dictionary

Returns

None

alias
data
class gerrychain.updaters.tally.Tally(fields: ~typing.Union[str, ~typing.List[str]], alias: ~typing.Optional[str] = None, dtype: ~typing.Type = <class 'int'>)[source]

Bases: object

An updater for keeping a tally of one or more node attributes.

Variables
  • fields – The list of node attributes that you want to tally. Or just a single attribute name as a string.

  • alias – The aliased name of this Tally (meaning, the key corresponding to this Tally in the Partition’s updaters dictionary)

  • dtype – The type (int, float, etc.) that you want the tally to have

Parameters
  • fields (Union[str, List[str]]) – The list of node attributes that you want to tally. Or a just a single attribute name as a string.

  • alias (Optional[str], optional) – The aliased name of this Tally (meaning, the key corresponding to this Tally in the Partition’s updaters dictionary). Default is None.

  • dtype (Any, optional) – The type (int, float, etc.) that you want the tally to have. Default is int.

Returns

None

alias
dtype
fields
gerrychain.updaters.tally.compute_in_flow(graph, fields: Union[str, List[str]], flow: Dict) int[source]
Parameters
  • graph (Graph) – The graph that the partition is defined on.

  • fields (Union[str, List[str]]) – The list of node attributes that you want to tally. Or just a single attribute name as a string.

  • flow (Dict) – A dictionary containing the flow from the parent of this partition to this partition. This dictionary is of the form {part: {‘in’: <set of nodes that flowed in>, ‘out’: <set of nodes that flowed out>}}.

Returns

The sum of the “field” attribute of nodes in the “in” set of the flow.

Return type

int

gerrychain.updaters.tally.compute_out_flow(graph, fields: Union[str, List[str]], flow: Dict) int[source]
Parameters
  • graph (Graph) – The graph that the partition is defined on.

  • fields (Union[str, List[str]]) – The list of node attributes that you want to tally. Or just a single attribute name as a string.

  • flow (Dict) – A dictionary containing the flow from the parent of this partition to this partition. This dictionary is of the form {part: {‘in’: <set of nodes that flowed in>, ‘out’: <set of nodes that flowed out>}}.

Returns

The sum of the “field” attribute of nodes in the “out” set of the flow.

Return type

int