API Reference

Public functions

openbasement.extract(graph, template, entity=None, transforms=None, merge_same_as=None)

Extract structured data from an RDF graph using a template.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| graph | Graph | An rdflib Graph loaded with RDF data. | required |
| template | str \| dict | Template source -- built-in name (e.g. "eu_procedure"), path to YAML file, or a dict. | required |
| entity | str \| None | Optional entity name to extract. If None, extracts the first (root) entity defined in the template. | None |
| transforms | dict \| None | Optional dict of custom transform name -> callable. These are merged with built-in transforms (custom takes precedence). Templates reference transforms by name via the "transform" field option. | None |
| merge_same_as | bool \| None | If True, group owl:sameAs-equivalent instances and merge their triples into one entity. If False, extract each URI as a separate entity. If None (default), uses the template's same_as_merge setting (which defaults to True). | None |

Returns:

| Type | Description |
| --- | --- |
| list[dict] | List of extracted entity dicts. |
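A minimal usage sketch for extract, assuming a Turtle file on disk and the built-in "eu_procedure" template named above. The helper name extract_procedures, the file path argument, and the custom transform collapse_whitespace are hypothetical, and the sketch assumes a transform callable takes one value and returns the transformed value. The custom transform only takes effect if the template references it by name.

```python
def collapse_whitespace(value):
    """Hypothetical custom transform: trim and collapse runs of whitespace."""
    return " ".join(str(value).split())


def extract_procedures(ttl_path):
    """Parse a Turtle file and extract entities with a built-in template."""
    # Imported inside the function so that defining this helper does not
    # require rdflib or openbasement to be installed.
    from rdflib import Graph
    import openbasement

    graph = Graph()
    graph.parse(ttl_path, format="turtle")

    return openbasement.extract(
        graph,
        "eu_procedure",
        # Assumption: a template field would refer to this transform by name.
        transforms={"collapse_whitespace": collapse_whitespace},
        # Keep each URI as its own entity instead of merging owl:sameAs groups.
        merge_same_as=False,
    )
```

Leaving merge_same_as at its default of None defers to the template's same_as_merge setting.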

openbasement.load_template(source)

Load and normalize a template from various sources.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source | str \| Path \| dict | One of: a str name of a built-in template (e.g. "eu_procedure"); a Path to a YAML file; a dict with template content. | required |

Returns:

| Type | Description |
| --- | --- |
| dict | Normalized template dict. |

openbasement.list_builtin_templates()

List available built-in template names.

openbasement.audit

Drift detection: compare template predicates against actual graph content.

audit(graph, template)

Compare a template against actual predicates in a graph.

For each entity type in the template, finds all instances and checks:

- Which template predicates are missing from the graph
- Which graph predicates are not covered by any template field or relation
- Overall coverage percentage

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| graph | Graph | An rdflib Graph to audit. | required |
| template | dict | A normalized template dict (output of load_template). | required |

Returns:

| Type | Description |
| --- | --- |
| dict[str, Any] | Dict with keys "entities" (per-entity audit results) and "summary" (overall coverage stats). |
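Because audit expects a normalized template rather than a template name, a typical call would probably go through load_template first. The sketch below assumes the function is importable as openbasement.audit.audit (inferred from the headings above); the helper name audit_graph and its arguments are hypothetical, and the inner structure of the two result values is not documented here, so they are returned as-is.

```python
def audit_graph(ttl_path, template_source="eu_procedure"):
    """Load a graph and a template, then return the two audit result parts."""
    # Imported inside the function so that defining this helper does not
    # require rdflib or openbasement to be installed.
    from rdflib import Graph
    import openbasement
    from openbasement.audit import audit  # assumed import path

    graph = Graph()
    graph.parse(ttl_path, format="turtle")

    # audit() takes the normalized dict, not the template name or path.
    template = openbasement.load_template(template_source)
    report = audit(graph, template)

    return report["entities"], report["summary"]
```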

Built-in transforms

| Name | Description |
| --- | --- |
| year_from_date | Extracts the first 4 characters (year) from a date string, e.g. "2023-07-05" -> "2023" |
| uri_local_name | Extracts the local name from a URI after # or /, e.g. "http://example.org/foo#Bar" -> "Bar" |
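To make the two descriptions concrete, here is a standalone re-implementation written only from the table above. It is not the library's source, and details such as coercing the input with str() are assumptions.

```python
import re


def year_from_date(value):
    """Return the first four characters of a date string (the year)."""
    return str(value)[:4]


def uri_local_name(value):
    """Return the part of a URI after the last '#' or '/'."""
    return re.split(r"[#/]", str(value))[-1]


print(year_from_date("2023-07-05"))                  # 2023
print(uri_local_name("http://example.org/foo#Bar"))  # Bar
```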

openbasement.transforms.apply_transform(value, transform_name, custom_transforms=None)

Apply a named transform to a value.

Looks up the transform name in custom_transforms first, then built-ins. Raises ValueError if the name is not found in either.
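The lookup order described above can be sketched as follows. This is an illustration of the documented behavior, not the library's code: apply_transform_sketch and BUILTIN_TRANSFORMS are hypothetical names, the registry holds a single stand-in entry, and the error message text is an assumption.

```python
# Stand-in for the library's built-in registry (illustrative only).
BUILTIN_TRANSFORMS = {
    "year_from_date": lambda value: str(value)[:4],
}


def apply_transform_sketch(value, transform_name, custom_transforms=None):
    """Apply a named transform, preferring custom transforms over built-ins."""
    custom = custom_transforms or {}
    if transform_name in custom:
        return custom[transform_name](value)
    if transform_name in BUILTIN_TRANSFORMS:
        return BUILTIN_TRANSFORMS[transform_name](value)
    raise ValueError(f"Unknown transform: {transform_name!r}")


# The built-in is used when no custom transform has the same name.
print(apply_transform_sketch("2023-07-05", "year_from_date"))  # 2023

# A custom transform with the same name takes precedence (returns an int here).
as_int = {"year_from_date": lambda value: int(str(value)[:4])}
print(apply_transform_sketch("2023-07-05", "year_from_date", as_int))  # 2023
```

Checking the custom dict first is what lets a caller override a built-in without changing the template.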