Relation:Import XML
In the first step, import creates a relation in which each value and each attribute is extracted together with its path.
In the second step, create a column id by identifying the row id in the path. Extract it from the path with regexreplace.
In the third step, create a column key for the columns you are interested in, and extract it from the path with regexreplace.
In the fourth step, project only id, key and text.
In the fifth step, deserialize, so that each key becomes a column and each id becomes one row.
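
The effect of the last step can be illustrated outside of Relation. The following is a minimal Python sketch, not the tool's implementation: it assumes that deserialize groups the rows by id and turns every key into a column. The helper name `pivot` is hypothetical, and the sample rows are copied from the example output on this page.

```python
from collections import defaultdict

# Long-format rows (id, key, text), copied from the example on this page.
rows = [
    ("1", "id", "bk101"),
    ("1", "author", "Gambardella, Matthew"),
    ("1", "title", "XML Developer's Guide"),
    ("2", "id", "bk102"),
    ("2", "author", "Ralls, Kim"),
    ("2", "title", "Midnight Rain"),
]


def pivot(rows):
    """Group rows by id and turn each key into a column (hypothetical helper)."""
    records = defaultdict(dict)
    for row_id, key, text in rows:
        records[row_id][key] = text
    # One dictionary per id, in order of first appearance.
    return list(records.values())


wide = pivot(rows)
for record in wide:
    print(record)
```

Each resulting dictionary corresponds to one row of the final table, with the keys as column names.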
Example
Using the file books.xml, the first 15 rows after import are:
path | text |
---|---|
catalog:1:book:0:id | bk101 |
catalog:1:book:1:author | Gambardella, Matthew |
catalog:1:book:2:title | XML Developer's Guide |
catalog:1:book:3:genre | Computer |
catalog:1:book:4:price | 44.95 |
catalog:1:book:5:publish_date | 2000-10-01 |
catalog:1:book:6:description | An in-depth look at creating applications with XML. |
catalog:2:book:0:id | bk102 |
catalog:2:book:1:author | Ralls, Kim |
catalog:2:book:2:title | Midnight Rain |
catalog:2:book:3:genre | Fantasy |
catalog:2:book:4:price | 5.95 |
catalog:2:book:5:publish_date | 2000-12-16 |
catalog:2:book:6:description | A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world. |
catalog:3:book:0:id | bk103 |

After id and key have been extracted and the relation has been projected to id, key and text, the first 15 rows are:

id | key | text |
---|---|---|
1 | id | bk101 |
1 | author | Gambardella, Matthew |
1 | title | XML Developer's Guide |
1 | genre | Computer |
1 | price | 44.95 |
1 | publish_date | 2000-10-01 |
1 | description | An in-depth look at creating applications with XML. |
2 | id | bk102 |
2 | author | Ralls, Kim |
2 | title | Midnight Rain |
2 | genre | Fantasy |
2 | price | 5.95 |
2 | publish_date | 2000-12-16 |
2 | description | A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world. |
3 | id | bk103 |

After deserialize, each book is a single row:

id | author | title | genre | price | publish_date | description |
---|---|---|---|---|---|---|
bk101 | Gambardella, Matthew | XML Developer's Guide | Computer | 44.95 | 2000-10-01 | An in-depth look at creating applications with XML. |
bk110 | O'Brien, Tim | Microsoft .NET: The Programming Bible | Computer | 36.95 | 2000-12-09 | Microsoft's .NET initiative is explored in detail in this deep programmer's reference. |
bk111 | O'Brien, Tim | MSXML3: A Comprehensive Guide | Computer | 36.95 | 2000-12-01 | The Microsoft MSXML3 parser is covered in detail, with attention to XML DOM interfaces, XSLT processing, SAX and more. |
bk112 | Galos, Mike | Visual Studio 7: A Comprehensive Guide | Computer | 49.95 | 2001-04-16 | Microsoft Visual Studio 7 is explored in depth, looking at how Visual Basic, Visual C++, C#, and ASP+ are integrated into a comprehensive development environment. |
bk102 | Ralls, Kim | Midnight Rain | Fantasy | 5.95 | 2000-12-16 | A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world. |
bk103 | Corets, Eva | Maeve Ascendant | Fantasy | 5.95 | 2000-11-17 | After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society. |
bk104 | Corets, Eva | Oberon's Legacy | Fantasy | 5.95 | 2001-03-10 | In post-apocalypse England, the mysterious agent known only as Oberon helps to create a new life for the inhabitants of London. Sequel to Maeve Ascendant. |
bk105 | Corets, Eva | The Sundered Grail | Fantasy | 5.95 | 2001-09-10 | The two daughters of Maeve, half-sisters, battle one another for control of England. Sequel to Oberon's Legacy. |
bk106 | Randall, Cynthia | Lover Birds | Romance | 4.95 | 2000-09-02 | When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled. |
bk107 | Thurman, Paula | Splish Splash | Romance | 4.95 | 2000-11-02 | A deep sea diver finds true love twenty thousand leagues beneath the sea. |
bk108 | Knorr, Stefan | Creepy Crawlies | Horror | 4.95 | 2000-12-06 | An anthology of horror stories about roaches, centipedes, scorpions and other insects. |
bk109 | Kress, Peter | Paradox Lost | Science Fiction | 6.95 | 2000-11-02 | After an inadvertant trip through a Heisenberg Uncertainty Device, James Salway discovers the problems of being quantum. |
See also
' XML files can be imported and relations can be extracted from the data.
import "books.xml"
print 15
// the id is the number before book
extend id regexreplace( path , ".*:(\d+):book.*" , "$1" )
// the key is the name after book and its index
extend key regexreplace( path , ".*book:\d+:(.*?)" , "$1" )
project id, key, text
print 15
deserialize
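
The two patterns can be checked outside of Relation. The sketch below uses Python's `re.sub` and rests on one assumption: that regexreplace, like `re.sub`, replaces only the matched part of the path. Python writes the group reference as `\1` instead of `$1`. The function name `extract` is hypothetical.

```python
import re

# Same patterns as in the script above.
ID_PATTERN = r".*:(\d+):book.*"
KEY_PATTERN = r".*book:\d+:(.*?)"


def extract(path):
    """Return (id, key) for one path value (hypothetical helper)."""
    # The id pattern matches the whole path, so only the captured number remains.
    row_id = re.sub(ID_PATTERN, r"\1", path, count=1)
    # The lazy group matches nothing, so the prefix up to and including
    # "book:<n>:" is removed and the rest of the path is the key.
    key = re.sub(KEY_PATTERN, r"\1", path, count=1)
    return row_id, key


print(extract("catalog:1:book:0:id"))            # ('1', 'id')
print(extract("catalog:2:book:5:publish_date"))  # ('2', 'publish_date')
```

If regexreplace were to require a match of the whole string, the lazy group `(.*?)` would probably have to become `(.*)` for the key to be captured.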