Helpers
create_footnote_string
Takes the text for a footnote, and returns a string with the correct formatting.
You can use this if you want to add the footnote to a string.
Currently, the newline replacement options are restricted to `LINEBREAK` and `NONE`.
The reserved characters `<`, `>` and `&` will be escaped temporarily,
but they will be correctly displayed in DSP-APP.
Attention
- The text in the footnote may be richtext, i.e. contain XML tags.
- Not all tags supported in ordinary richtext are currently implemented.
- The allowed tags are:
  - `<br>` (break line)
  - `<strong>` (bold)
  - `<em>` (italic)
  - `<u>` (underline)
  - `<strike>` (strike through)
  - `<a href="URI">` (link to a URI)
  - `<a class="salsah-link" href="Knora IRI">` (link to a resource)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`footnote_text` | `str` | Text for the footnote | required |
`newline_replacement_option` | `NewlineReplacement` | options to replace newlines | `LINEBREAK` |
Returns:
Type | Description |
---|---|
`str` | The footnote as a string |
Examples:
result = xmllib.create_footnote_string("Text")
# result == '<footnote content="Text"/>'
result = xmllib.create_footnote_string("Text\nSecond Line")
# result == '<footnote content="Text&lt;br/&gt;Second Line"/>'
result = xmllib.create_footnote_string("Already escaped &lt;&gt;")
# already escaped characters will not be escaped again
# result == '<footnote content="Already escaped &lt;&gt;"/>'
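The escaping behaviour described above can be illustrated with a minimal standard-library sketch. It is not the dsp-tools implementation: the function name `footnote_string_sketch` and the boolean `replace_newlines` flag (standing in for `LINEBREAK` and `NONE`) are hypothetical, and the assumption is that already escaped text is unescaped once before everything is escaped for the `content` attribute.

```python
from xml.sax.saxutils import escape, unescape


def footnote_string_sketch(footnote_text: str, replace_newlines: bool = True) -> str:
    """Hypothetical stand-in for create_footnote_string, not the library code."""
    if replace_newlines:
        # Corresponds to the LINEBREAK option: newlines become <br/> tags.
        footnote_text = footnote_text.replace("\n", "<br/>")
    # Unescape first, so that already escaped characters are not escaped twice.
    plain = unescape(footnote_text)
    # Escape &, < and > (plus the double quote) for use inside an attribute value.
    escaped = escape(plain, {'"': "&quot;"})
    return f'<footnote content="{escaped}"/>'


print(footnote_string_sketch("Text\nSecond Line"))
print(footnote_string_sketch("Already escaped &lt;&gt;"))
```

Unescaping before escaping is what keeps the operation idempotent for input that was escaped beforehand.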
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
create_footnote_element
Takes the text for a footnote, and returns an `etree.Element`.
You can use this if you are working with `lxml`.
Currently, the newline replacement options are restricted to `LINEBREAK` and `NONE`.
Attention
- The text in the footnote may be richtext, i.e. contain XML tags.
- Not all tags supported in ordinary richtext are currently implemented.
- The allowed tags are:
  - `<br>` (break line)
  - `<strong>` (bold)
  - `<em>` (italic)
  - `<u>` (underline)
  - `<strike>` (strike through)
  - `<a href="URI">` (link to a URI)
  - `<a class="salsah-link" href="Knora IRI">` (link to a resource)
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`footnote_text` | `str` | Text for the footnote | required |
`newline_replacement_option` | `NewlineReplacement` | options to replace newlines | `LINEBREAK` |
Returns:
Type | Description |
---|---|
`_Element` | The footnote as an etree element |
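To show how a footnote element could be attached to a tree, here is a rough sketch that uses the standard library's `xml.etree.ElementTree` as a stand-in for `lxml`. The helper name `footnote_element_sketch`, the `replace_newlines` flag, and the surrounding `<p>` element are hypothetical; the sketch assumes the footnote is an empty `footnote` element whose `content` attribute carries the text.

```python
import xml.etree.ElementTree as ET


def footnote_element_sketch(footnote_text: str, replace_newlines: bool = True) -> ET.Element:
    """Hypothetical stand-in for create_footnote_element, built on the standard library."""
    if replace_newlines:
        # Corresponds to the LINEBREAK option.
        footnote_text = footnote_text.replace("\n", "<br/>")
    # The footnote text is stored in the "content" attribute of an empty element.
    return ET.Element("footnote", attrib={"content": footnote_text})


# Append the footnote to a (hypothetical) paragraph element.
paragraph = ET.Element("p")
paragraph.text = "Main text"
paragraph.append(footnote_element_sketch("Text\nSecond Line"))

# Reserved characters in the attribute are escaped only when the tree is serialised.
serialised = ET.tostring(paragraph, encoding="unicode")
print(serialised)
```

Working with elements rather than strings leaves the escaping to the serialiser, which is the usual reason to prefer this variant in an `lxml` workflow.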
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
create_standoff_link_to_resource
Creates a standoff link to a resource.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`resource_id` | `str` | ID of the resource that is linked | required |
`displayed_text` | `str` | text to display for the embedded link | required |
Returns:
Type | Description |
---|---|
`str` | A standoff link in string form. |
Examples:
result = xmllib.create_standoff_link_to_resource("resource_id", "Text")
# result == '<a class="salsah-link" href="IRI:resource_id:IRI">Text</a>'
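The returned link is a plain string, so it can be embedded in a larger richtext value. The sketch below only mirrors the documented output format; `resource_link_sketch` is a hypothetical stand-in, not the dsp-tools function.

```python
def resource_link_sketch(resource_id: str, displayed_text: str) -> str:
    """Hypothetical stand-in that reproduces the documented output format."""
    # The "IRI:<id>:IRI" wrapper marks the href as an internal resource ID.
    return f'<a class="salsah-link" href="IRI:{resource_id}:IRI">{displayed_text}</a>'


# Embed the link in a longer richtext string.
sentence = f"See {resource_link_sketch('resource_id', 'Text')} for details."
print(sentence)
```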
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
create_standoff_link_to_uri
Creates a standoff link to a URI.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`uri` | `str` | the target URI that should be linked to | required |
`displayed_text` | `str` | text to display for the embedded link | required |
Returns:
Type | Description |
---|---|
`str` | A standoff link in string form. |
Examples:
result = xmllib.create_standoff_link_to_uri("https://www.dasch.swiss/", "This is DaSCH")
# result == '<a href="https://www.dasch.swiss/">This is DaSCH</a>'
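When several URIs have to be linked in one richtext value, the links can be generated in a loop and joined. This is a small sketch under the assumption that the output format is exactly the one documented above; `uri_link_sketch` and the `links` dictionary are hypothetical.

```python
def uri_link_sketch(uri: str, displayed_text: str) -> str:
    """Hypothetical stand-in that reproduces the documented output format."""
    return f'<a href="{uri}">{displayed_text}</a>'


# Map the text to display to the target URI.
links = {"DaSCH": "https://www.dasch.swiss/"}
richtext = "Hosted by " + ", ".join(uri_link_sketch(uri, name) for name, uri in links.items())
print(richtext)
```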
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
create_label_to_name_list_node_mapping
Often, data sources contain list values named after the "label" of the JSON project list node, instead of the "name"
which is needed for the dsp-tools `xmlupload`.
To create a correct XML, you need a dictionary that maps the "labels" to their correct "names".
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`project_json_path` | `str` | path to a JSON project file (a.k.a. ontology) | required |
`list_name` | `str` | name of a list in the JSON project | required |
`language_of_label` | `str` | which language of the label to choose | required |
Returns:
Type | Description |
---|---|
`dict[str, str]` | a dictionary of the form `{label: name}` |
Examples:
"lists": [
    {
        "name": "listName",
        "labels": {
            "en": "List",
            "de": "Liste"
        },
        "comments": { ... },
        "nodes": [
            {
                "name": "n1",
                "labels": {
                    "en": "Node 1",
                    "de": "Knoten 1"
                }
            },
            {
                "name": "n2",
                "labels": {
                    "en": "Node 2",
                    "de": "Knoten 2"
                }
            }
        ]
    }
]
result = xmllib.create_label_to_name_list_node_mapping(
    project_json_path="project.json",
    list_name="listName",
    language_of_label="de",
)
# result == {"Knoten 1": "n1", "knoten 1": "n1", "Knoten 2": "n2", "knoten 2": "n2"}
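The shape of the mapping can be reproduced with a short standard-library sketch. It is a hypothetical re-implementation, not the library code: it works on an in-memory list definition instead of a file path, and it assumes that nested sub-nodes are included and that each label is also added in lower case, as the documented result suggests.

```python
import json

# The list definition from the example above, without the elided "comments".
LISTS_JSON = """
[
  {
    "name": "listName",
    "labels": {"en": "List", "de": "Liste"},
    "nodes": [
      {"name": "n1", "labels": {"en": "Node 1", "de": "Knoten 1"}},
      {"name": "n2", "labels": {"en": "Node 2", "de": "Knoten 2"}}
    ]
  }
]
"""


def label_to_name_sketch(lists: list, list_name: str, language: str) -> dict:
    """Hypothetical sketch: map node labels (in one language) to node names."""
    target = next(lst for lst in lists if lst["name"] == list_name)
    mapping = {}

    def visit(nodes: list) -> None:
        for node in nodes:
            label = node["labels"][language]
            mapping[label] = node["name"]
            # Lower-cased variant, as in the documented result.
            mapping[label.lower()] = node["name"]
            # Assumption: sub-nodes are part of the mapping as well.
            visit(node.get("nodes", []))

    visit(target.get("nodes", []))
    return mapping


result = label_to_name_sketch(json.loads(LISTS_JSON), "listName", "de")
print(result["Knoten 1"])
```

A typical use is a lookup such as `result[cell_value]` while converting a spreadsheet column that contains labels.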
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
escape_reserved_xml_characters
From richtext strings (encoding="xml"), escape the reserved characters `<`, `>` and `&`,
but only if they are not part of a standard standoff tag or escape sequence.
See the documentation for the standard standoff tags allowed by DSP-API, which will not be escaped.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`text` | `str` | the richtext string to be escaped | required |
Returns:
Type | Description |
---|---|
`str` | The escaped richtext string |
Examples:
result = xmllib.escape_reserved_xml_characters("Text <unknownTag>")
# result == "Text &lt;unknownTag&gt;"
result = xmllib.escape_reserved_xml_characters("Text <br/> text after")
# result == "Text <br/> text after"
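One way to get this behaviour is to split the text on known tags and escape sequences and to escape only the remaining pieces. The following is a simplified sketch of that idea, not the dsp-tools implementation; `ALLOWED_TAGS` is a hypothetical, deliberately short subset of the standard standoff tags.

```python
import re

# Hypothetical subset of tags that are left untouched; the real list is longer.
ALLOWED_TAGS = ("br", "strong", "em", "u", "strike", "a")

# An opening, closing or self-closing tag from the allowed list, with optional attributes.
_TAG = r"</?(?:" + "|".join(ALLOWED_TAGS) + r")(?:\s[^<>]*)?/?>"
# An existing escape sequence such as &lt; or &#60;
_ENTITY = r"&(?:lt|gt|amp|quot|apos|#\d+|#x[0-9a-fA-F]+);"
_KEEP = re.compile("(" + _TAG + "|" + _ENTITY + ")")


def escape_sketch(text: str) -> str:
    """Escape <, > and & unless they belong to an allowed tag or an escape sequence."""
    # With one capturing group, split() alternates between free text and kept tokens.
    pieces = _KEEP.split(text)
    for index in range(0, len(pieces), 2):
        pieces[index] = (
            pieces[index].replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
        )
    return "".join(pieces)


print(escape_sketch("Text <unknownTag>"))
print(escape_sketch("Text <br/> text after"))
```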
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
find_date_in_string
Checks if a string contains a date value (single date or date range) and returns the first date found as a DSP-formatted string; see the XML documentation for details.
Returns `None` if no date was found.
Notes
- All dates are interpreted in the Christian era and the Gregorian calendar.
- BC dates are only supported in French notation (e.g. 1000-900 av. J.-C.).
- The years 0000-2999 are supported, in 3/4-digit form.
- Dates written with slashes are always interpreted in a European manner: 5/11/2021 is the 5th of November.
- In the European notation, 2-digit years are expanded to 4 digits, with the current year as watershed:
- 30.4.24 -> 30.04.2024
- 30.4.50 -> 30.04.1950
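The watershed rule for 2-digit years can be sketched as follows. `expand_two_digit_year` is a hypothetical helper, not part of dsp-tools, and the handling of the boundary (a 2-digit year equal to the current one is treated as the current century) is an assumption.

```python
from datetime import date
from typing import Optional


def expand_two_digit_year(two_digit_year: int, current_year: Optional[int] = None) -> int:
    """Hypothetical helper: expand a 2-digit year relative to the current year."""
    if current_year is None:
        current_year = date.today().year
    century, watershed = divmod(current_year, 100)
    if two_digit_year <= watershed:
        # Not later than the current year: same century as today.
        return century * 100 + two_digit_year
    # Otherwise the year would lie in the future, so use the previous century.
    return (century - 1) * 100 + two_digit_year


print(expand_two_digit_year(24, current_year=2025))  # 2024
print(expand_two_digit_year(50, current_year=2025))  # 1950
```

Passing `current_year` explicitly keeps the behaviour reproducible in tests; without it, the result depends on the day the code runs.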
Currently supported date formats
- 0476-09-04 -> GREGORIAN:CE:0476-09-04:CE:0476-09-04
- 0476_09_04 -> GREGORIAN:CE:0476-09-04:CE:0476-09-04
- 30.4.2021 -> GREGORIAN:CE:2021-04-30:CE:2021-04-30
- 30.4.21 -> GREGORIAN:CE:2021-04-30:CE:2021-04-30
- 5/11/2021 -> GREGORIAN:CE:2021-11-05:CE:2021-11-05
- Jan 26, 1993 -> GREGORIAN:CE:1993-01-26:CE:1993-01-26
- 28.2.-1.12.1515 -> GREGORIAN:CE:1515-02-28:CE:1515-12-01
- 25.-26.2.0800 -> GREGORIAN:CE:0800-02-25:CE:0800-02-26
- 1.9.2022-3.1.2024 -> GREGORIAN:CE:2022-09-01:CE:2024-01-03
- 1848 -> GREGORIAN:CE:1848:CE:1848
- 1849/1850 -> GREGORIAN:CE:1849:CE:1850
- 1849/50 -> GREGORIAN:CE:1849:CE:1850
- 1845-50 -> GREGORIAN:CE:1845:CE:1850
- 840-50 -> GREGORIAN:CE:840:CE:850
- 840-1 -> GREGORIAN:CE:840:CE:841
- 1000-900 av. J.-C. -> GREGORIAN:BC:1000:BC:900
- 45 av. J.-C. -> GREGORIAN:BC:45:BC:45
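The year-range formats in this list share one pattern: a shortened end year borrows its leading digits from the start year. The sketch below covers only that subset (CE year ranges such as `1849/50` or `840-1`); `year_range_sketch` and its regular expression are hypothetical and much narrower than what `find_date_in_string` supports.

```python
import re
from typing import Optional

# Start year with 3 or 4 digits, a slash or hyphen, then a possibly shortened end year.
_YEAR_RANGE = re.compile(r"^(\d{3,4})[/-](\d{1,4})$")


def year_range_sketch(text: str) -> Optional[str]:
    """Hypothetical sketch: format a CE year range as a DSP date string."""
    match = _YEAR_RANGE.match(text.strip())
    if not match:
        return None
    start, end = match.groups()
    if len(end) < len(start):
        # "1849/50" -> "1850": take the missing leading digits from the start year.
        end = start[: len(start) - len(end)] + end
    if int(end) < int(start):
        return None
    return f"GREGORIAN:CE:{start}:CE:{end}"


print(year_range_sketch("1849/50"))  # GREGORIAN:CE:1849:CE:1850
print(year_range_sketch("840-1"))    # GREGORIAN:CE:840:CE:841
```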
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`string` | `str` | string to check | required |
Returns:
Type | Description |
---|---|
`str \| None` | DSP-formatted date string, or `None` |
Examples:
result = xmllib.find_date_in_string("1849/1850")
# result == "GREGORIAN:CE:1849:CE:1850"
result = xmllib.find_date_in_string("not a valid date")
# result == None
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
make_xsd_compatible_id
An xsd:ID may not contain all types of special characters,
and it must start with a letter or underscore.
Replace illegal characters with `_`, and prepend a leading `_` if necessary.
The string must contain at least one Unicode letter (matching the regex `\p{L}`), `_`, `!`, `?`, or number, but must not be `None`, `<NA>`, `N/A`, or `-`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`input_value` | `str \| float \| int` | input value | required |
Returns:
Type | Description |
---|---|
`str` | An xsd ID compatible string based on the input value |
Examples:
result = xmllib.make_xsd_compatible_id("0_Universität_Basel")
# result == "_0_Universit_t_Basel"
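The replacement rule can be approximated with two regular expressions. This is a sketch, not the dsp-tools implementation: `xsd_id_sketch` is a hypothetical name, it raises `ValueError` for invalid input (the error type of the library is not shown here), and the set of characters that are kept (ASCII letters, digits, `_`, `.` and `-`) is an assumption based on what an xsd:ID permits.

```python
import re


def xsd_id_sketch(input_value: object) -> str:
    """Hypothetical sketch of turning an arbitrary value into an xsd:ID."""
    text = str(input_value).strip()
    # Reject placeholders and strings without any letter, digit, "_", "!" or "?".
    if text in {"None", "<NA>", "N/A", "-"} or not re.search(r"[\w!?]", text):
        raise ValueError(f"The input '{input_value}' cannot be transformed to an xsd:ID")
    # Replace everything except ASCII letters, digits, "_", "." and "-".
    cleaned = re.sub(r"[^\w.\-]", "_", text, flags=re.ASCII)
    # An xsd:ID must start with a letter or an underscore.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


print(xsd_id_sketch("0_Universität_Basel"))  # _0_Universit_t_Basel
```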
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
make_xsd_compatible_id_with_uuid
An xsd:ID may not contain all types of special characters,
and it must start with a letter or underscore.
Replace illegal characters with `_`, and prepend a leading `_` if necessary.
Additionally, add a UUID at the end.
The UUID will be different each time the function is called.
The string must contain at least one Unicode letter (matching the regex `\p{L}`), `_`, `!`, `?`, or number, but must not be `None`, `<NA>`, `N/A`, or `-`.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`input_value` | `str \| float \| int` | input value | required |
Returns:
Type | Description |
---|---|
`str` | an xsd ID based on the input value, with a UUID attached. |
Examples:
result = xmllib.make_xsd_compatible_id_with_uuid("Universität_Basel")
# result == "Universit_t_Basel_88f5cd0b-f333-4174-9030-65900b17773d"
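Because every call produces a different UUID, an ID that other resources need to reference has to be stored once and reused. The sketch below illustrates this with a lookup dictionary; `xsd_id_with_uuid_sketch` is a hypothetical stand-in that assumes the same character replacement as described above and omits the input validation.

```python
import re
import uuid


def xsd_id_with_uuid_sketch(input_value: object) -> str:
    """Hypothetical sketch: xsd:ID-compatible string with a random UUID appended."""
    cleaned = re.sub(r"[^\w.\-]", "_", str(input_value), flags=re.ASCII)
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return f"{cleaned}_{uuid.uuid4()}"


# Each call yields a new ID, so keep the result if it has to be referenced again.
names = ["Universität_Basel", "ETH Zürich"]
id_lookup = {name: xsd_id_with_uuid_sketch(name) for name in names}
print(id_lookup["Universität_Basel"])
```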
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
create_list_from_string
Creates a list from a string. Leading and trailing whitespace is removed from the list items.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`string` | `str` | input string | required |
`separator` | `str` | The character that separates the different values in the string. For example, a comma or newline. | required |
Returns:
Type | Description |
---|---|
`list[str]` | The list that results from splitting the input string. If the original string is empty or consists only of whitespace characters, the resulting list will be empty. |
Examples:
result = xmllib.create_list_from_string(" One/ Two\n/", "/")
# result == ["One", "Two"]
result = xmllib.create_list_from_string(" \n ", "\n")
# result == []
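The documented results are consistent with a split followed by stripping each item and dropping the empty ones. A minimal sketch of that reading, with the hypothetical name `list_from_string_sketch`:

```python
def list_from_string_sketch(string: str, separator: str) -> list:
    """Hypothetical sketch: split, strip each item, and drop empty items."""
    return [item.strip() for item in string.split(separator) if item.strip()]


print(list_from_string_sketch(" One/ Two\n/", "/"))  # ['One', 'Two']
print(list_from_string_sketch(" \n ", "\n"))         # []
```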
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py
create_non_empty_list_from_string
Creates a list from a string. Leading and trailing whitespace is removed from the list items.
If the resulting list is empty, an `InputError` is raised.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`string` | `str` | input string | required |
`separator` | `str` | The character that separates the different values in the string. For example, a comma or newline. | required |
`resource_id` | `str \| None` | If the ID of the resource is provided, a better error message can be composed | `None` |
`prop_name` | `str \| None` | If the name of the property is provided, a better error message can be composed | `None` |
Returns:
Type | Description |
---|---|
`list[str]` | The list that results from splitting the input string. |
Examples:
result = xmllib.create_non_empty_list_from_string("One\nTwo ", "\n")
# result == ["One", "Two"]
result = xmllib.create_non_empty_list_from_string(" \n/ ", "/")
# raises InputError
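The role of the optional `resource_id` and `prop_name` arguments can be illustrated with a sketch in which they only enrich the error message. Everything here is hypothetical: `InputErrorSketch` stands in for the library's `InputError`, and the message wording is invented for illustration.

```python
from typing import Optional


class InputErrorSketch(Exception):
    """Hypothetical stand-in for the InputError raised by the library."""


def non_empty_list_sketch(
    string: str,
    separator: str,
    resource_id: Optional[str] = None,
    prop_name: Optional[str] = None,
) -> list:
    items = [item.strip() for item in string.split(separator) if item.strip()]
    if not items:
        # The optional arguments only serve to make the message more specific.
        context = [
            f"resource '{resource_id}'" if resource_id else "",
            f"property '{prop_name}'" if prop_name else "",
        ]
        where = ", ".join(part for part in context if part)
        suffix = f" ({where})" if where else ""
        raise InputErrorSketch(f"The input string produced an empty list{suffix}.")
    return items


print(non_empty_list_sketch("One\nTwo ", "\n"))  # ['One', 'Two']
```

Passing both identifiers is worthwhile in batch conversions, where an empty cell is otherwise hard to trace back to its row.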
Source code in dsp/dsp-tools/src/dsp_tools/xmllib/helpers.py