I am working with the XML below and need to extract some specific text from it.
The XML:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="CoreNLP-to-HTML.xsl" type="text/xsl"?>
<root>
<document>
<dependencies type="collapsed-dependencies"> <!-- the collapsed-dependencies tag -->
<dep type="root">
<governor idx="0">ROOT</governor>
<dependent idx="8">provide</dependent>
</dep>
<dep type="mark">
<governor idx="2">requested</governor>
<dependent idx="1">If</dependent>
</dep>
</dependencies>
</document>
</root>
I would like the output in the following format:
root(ROOT-0,provide-8)
mark(requested-2,If-1)
advcl(provide-8,requested-2)
case(TD-4,by-3)
I am able to extract each of the parameters separately, but I cannot produce the combined output in one go.
library(XML)  # provides xpathSApply, xmlValue and xmlGetAttr
# doc is the parsed document, e.g. doc <- xmlParse("input.xml"), where "input.xml" is a placeholder file name

# concatenated text of all words within collapsed-dependencies
abstracts <- xpathSApply(doc, "//*/dependencies[@type='collapsed-dependencies']", xmlValue)
abstracts # value
#"ROOTproviderequestedIfproviderequestedTDby"

# the type attribute of each dep
type <- xpathSApply(doc, "//dependencies/dep", xmlGetAttr, "type")
type # value
#[1] "root" "mark" "advcl" "case"

# the idx attribute of each governor
idx1 <- xpathSApply(doc, "//dependencies/dep/governor", xmlGetAttr, "idx")
idx1 # value
#[1] "0" "2" "8" "4" "2" "8"

# the idx attribute of each dependent
idx2 <- xpathSApply(doc, "//dependencies/dep/dependent", xmlGetAttr, "idx")
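One possible way to get the combined output in a single pass is to visit each dep node once and build the string from its own type, governor and dependent, rather than extracting separate vectors and lining them up afterwards. The following is only a sketch under some assumptions: it uses a trimmed inline copy of the sample XML (only the two dep entries shown above, without the XML declaration and stylesheet line), and format_dep is a hypothetical helper name, not part of the XML package.

```r
library(XML)

# Trimmed inline copy of the sample XML from the question (assumption:
# the real file has the same structure, with more <dep> entries).
xml_text <- '<root>
<document>
<dependencies type="collapsed-dependencies">
<dep type="root">
<governor idx="0">ROOT</governor>
<dependent idx="8">provide</dependent>
</dep>
<dep type="mark">
<governor idx="2">requested</governor>
<dependent idx="1">If</dependent>
</dep>
</dependencies>
</document>
</root>'

doc <- xmlParse(xml_text, asText = TRUE)

# Hypothetical helper: format one <dep> node as type(governor-idx,dependent-idx)
format_dep <- function(dep) {
  gov <- dep[["governor"]]   # first <governor> child of this dep
  dpt <- dep[["dependent"]]  # first <dependent> child of this dep
  sprintf("%s(%s-%s,%s-%s)",
          xmlGetAttr(dep, "type"),
          xmlValue(gov), xmlGetAttr(gov, "idx"),
          xmlValue(dpt), xmlGetAttr(dpt, "idx"))
}

# Restrict the XPath to the collapsed-dependencies block so that dep nodes
# from any other dependencies block in the file are not mixed in.
deps <- xpathSApply(doc,
                    "//dependencies[@type='collapsed-dependencies']/dep",
                    format_dep)
writeLines(deps)
```

For the two entries in the sample this should print root(ROOT-0,provide-8) and mark(requested-2,If-1), one per line. Because each string is built from a single dep node, the governor and dependent values cannot get out of step with the type, which can happen when separately extracted vectors have different lengths.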