API Reference
Public functions
openbasement.extract(graph, template, entity=None, transforms=None, merge_same_as=None)
Extract structured data from an RDF graph using a template.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `graph` | `Graph` | An rdflib Graph loaded with RDF data. | required |
| `template` | `str \| dict` | Template source: a built-in name (e.g. "eu_procedure"), a path to a YAML file, or a dict. | required |
| `entity` | `str \| None` | Optional entity name to extract. If None, extracts the first (root) entity defined in the template. | `None` |
| `transforms` | `dict \| None` | Optional dict mapping custom transform names to callables. These are merged with the built-in transforms (custom takes precedence). Templates reference transforms by name via the "transform" field option. | `None` |
| `merge_same_as` | `bool \| None` | If True, group owl:sameAs-equivalent instances and merge their triples into one entity. If False, extract each URI as a separate entity. If None (default), use the template's same_as_merge setting (which defaults to True). | `None` |
Returns:
| Type | Description |
|---|---|
| `list[dict]` | List of extracted entity dicts. |
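
The three-state `merge_same_as` behavior described above can be pictured with a small standalone sketch. This is not openbasement's implementation: `resolve_merge_same_as` and `group_same_as` are hypothetical helpers, the sketch assumes `same_as_merge` is a top-level key of the template dict, and owl:sameAs links are modelled as plain pairs of URI strings rather than rdflib triples.

```python
from collections import defaultdict


def resolve_merge_same_as(merge_same_as, template):
    """Explicit True/False wins; None defers to the template (default True).

    Assumption: the template stores the setting under a top-level
    "same_as_merge" key.
    """
    if merge_same_as is not None:
        return merge_same_as
    return template.get("same_as_merge", True)


def group_same_as(uris, same_as_pairs):
    """Group URIs into owl:sameAs equivalence classes using union-find."""
    parent = {u: u for u in uris}

    def find(u):
        # Walk up to the representative, halving the path as we go.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in same_as_pairs:
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for u in parent:
        groups[find(u)].add(u)
    # Sorted output keeps the result deterministic.
    return sorted(sorted(g) for g in groups.values())
```

With merging enabled, each group would yield one extracted entity; with merging disabled, every URI would be extracted on its own.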
openbasement.load_template(source)
Load and normalize a template from various sources.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str \| Path \| dict` | One of: the `str` name of a built-in template (e.g. "eu_procedure"), a `Path` to a YAML file, or a `dict` with template content. | required |
Returns:
| Type | Description |
|---|---|
| `dict` | Normalized template dict. |
openbasement.list_builtin_templates()
List available built-in template names.
openbasement.audit
Drift detection: compare template predicates against actual graph content.
audit(graph, template)
Compare a template against actual predicates in a graph.
For each entity type in the template, finds all instances and checks:

- Which template predicates are missing from the graph
- Which graph predicates are not covered by any template field or relation
- Overall coverage percentage
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `graph` | `Graph` | An rdflib Graph to audit. | required |
| `template` | `dict` | A normalized template dict (output of load_template). | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | Dict with keys "entities" (per-entity audit results) and "summary" (overall coverage stats). |
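
The per-entity comparison amounts to set arithmetic over predicates. The sketch below is illustrative only: `audit_entity_sketch` and its result keys (`"missing"`, `"uncovered"`, `"coverage"`) are hypothetical, and it assumes coverage means the share of graph predicates matched by the template, which the reference does not specify.

```python
def audit_entity_sketch(template_predicates, graph_predicates):
    """Compare the predicates a template expects with those a graph contains.

    Assumption: coverage is the percentage of graph predicates that the
    template accounts for. The real audit() may define it differently.
    """
    template_predicates = set(template_predicates)
    graph_predicates = set(graph_predicates)

    missing = template_predicates - graph_predicates    # in template, absent from graph
    uncovered = graph_predicates - template_predicates  # in graph, not in template
    covered = graph_predicates & template_predicates

    if graph_predicates:
        coverage = 100.0 * len(covered) / len(graph_predicates)
    else:
        coverage = 100.0  # nothing in the graph to cover

    return {
        "missing": sorted(missing),
        "uncovered": sorted(uncovered),
        "coverage": coverage,
    }
```

A non-empty `"uncovered"` list is the drift signal: the data has gained predicates that the template does not yet extract.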
Built-in transforms
| Name | Description |
|---|---|
| `year_from_date` | Extracts the first 4 characters (year) from a date string, e.g. "2023-07-05" -> "2023" |
| `uri_local_name` | Extracts the local name from a URI after # or /, e.g. "http://example.org/foo#Bar" -> "Bar" |
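
The behavior in the table can be reproduced in a few lines of plain Python. These are stand-ins written from the descriptions above, not the library's source; the `_sketch` names are hypothetical, and the handling of inputs without `#` or `/` is an assumption.

```python
def year_from_date_sketch(value):
    """Return the first 4 characters of a date string."""
    return str(value)[:4]


def uri_local_name_sketch(value):
    """Return the part of a URI after the last '#' or '/'."""
    text = str(value)
    # rfind returns -1 when the separator is absent, so a string with
    # neither separator is returned unchanged (an assumption here).
    cut = max(text.rfind("#"), text.rfind("/"))
    return text[cut + 1:]
```

A custom transform passed through the `transforms` argument of `extract` would take the same single-value-in, single-value-out shape.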
openbasement.transforms.apply_transform(value, transform_name, custom_transforms=None)
Apply a named transform to a value.
Looks up the transform name in custom_transforms first, then built-ins. Raises ValueError if the name is not found in either.
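
The lookup order can be sketched as follows. This is a minimal illustration of "custom first, then built-ins, otherwise ValueError", not the actual function: `apply_transform_sketch` and `BUILTINS_SKETCH` are hypothetical names, and the error message wording is invented.

```python
# Hypothetical stand-in for the built-in registry.
BUILTINS_SKETCH = {
    "year_from_date": lambda value: str(value)[:4],
}


def apply_transform_sketch(value, transform_name, custom_transforms=None):
    """Apply a named transform, preferring custom entries over built-ins."""
    custom = custom_transforms or {}
    if transform_name in custom:
        return custom[transform_name](value)
    if transform_name in BUILTINS_SKETCH:
        return BUILTINS_SKETCH[transform_name](value)
    raise ValueError(f"Unknown transform: {transform_name!r}")


# A custom transform with the same name shadows the built-in one.
custom = {"year_from_date": lambda value: int(str(value)[:4])}
as_text = apply_transform_sketch("2023-07-05", "year_from_date")
as_int = apply_transform_sketch("2023-07-05", "year_from_date", custom)
```

Checking the custom dict first is what lets a caller override a built-in without renaming it in the template.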