DELPH-IN: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
No edit summary
Adding short description: "Collaborative linguistics project"
 
(11 intermediate revisions by 9 users not shown)
Line 1: Line 1:
{{Short description|Collaborative linguistics project}}
{{Infobox|name = DELPH-IN
{{Infobox|name = DELPH-IN
|bodystyle =
|bodystyle =
Line 11: Line 12:
|header4 = DELPH-IN Summits
|header4 = DELPH-IN Summits
|label5 = Inaugural:
|label5 = Inaugural:
|data5 = [http://moin.delph-in.net/LisbonTop LisbonTop] (2005)
|data5 = [https://github.com/delph-in/docs/wiki/LisbonTop LisbonTop] (2005)
|label6 = Latest:
|label6 = Latest:
|data6 = [http://moin.delph-in.net/SingaporeTop SingaporeTop] (2015)
|data6 = [https://github.com/delph-in/docs/wiki/Virtual2021Top Virtual2021Top] (2021)
|label7 = Upcoming:
|label7 = Upcoming:
|data7 = [https://github.com/delph-in/docs/wiki/FairhavenTop FairhavenTop] (2022)
|data7 = TBA
<!-- |header8 = External Links
<!-- |header8 = External Links
|label9 = DELPH-IN website:
|label9 = DELPH-IN website:
|data9 =
|data9 =
|label10 = DELPH-IN wiki:
|label10 = DELPH-IN wiki:
|data10 = http://moin.delph-in.net/ -->
|data10 = https://github.com/delph-in/docs/wiki -->
|belowstyle = background:#ddf;
|belowstyle = background:#ddf;
|below =
|below =
}}
}}


'''DE'''ep '''L'''inguistic '''P'''rocessing with '''H'''PSG - '''IN'''itiative ('''DELPH-IN''') is a collaboration where computational linguists worldwide develop [[natural language processing]] tools for [[deep linguistic processing]] of human language.<ref>[http://www.delph-in.net/ DELPH-IN: Open-Source Deep Processing]</ref> The goal of DELPH-IN is to combine linguistic and statistical processing methods in order to computationally understand the meaning of texts and utterances.
'''De'''ep '''L'''inguistic '''P'''rocessing with '''H'''PSG - '''IN'''itiative ('''DELPH-IN''') is a collaboration where computational linguists worldwide develop [[natural language processing]] tools for [[deep linguistic processing]] of human language.<ref>[http://www.delph-in.net/ DELPH-IN: Open-Source Deep Processing]</ref> The goal of DELPH-IN is to combine linguistic and statistical processing methods in order to computationally understand the meaning of texts and utterances.


The tools developed by DELPH-IN adopts two linguistics formalisms for deep linguistic analysis, viz. [[head-driven phrase structure grammar]] (HPSG) and [[minimal recursion semantics]] (MRS).<ref>Ann Copestake, Dan Flickinger, Carl Pollard and Ivan A. Sag. 2005. [http://lingo.stanford.edu/sag/papers/copestake.pdf Minimal Recursion Semantics: An Introduction]. In Proceedings of Research on Language and Computation.</ref> All tools under the DELPH-IN collaboration are developed for general use of [[open-source]] licensing.
The tools developed by DELPH-IN adopt two linguistic formalisms for deep linguistic analysis, viz. [[head-driven phrase structure grammar]] (HPSG) and [[minimal recursion semantics]] (MRS).<ref>Ann Copestake, Dan Flickinger, Carl Pollard and Ivan A. Sag. 2005. [http://lingo.stanford.edu/sag/papers/copestake.pdf Minimal Recursion Semantics: An Introduction] {{Webarchive|url=https://web.archive.org/web/20120717034844/http://lingo.stanford.edu/sag/papers/copestake.pdf |date=2012-07-17 }}. In Proceedings of Research on Language and Computation.</ref> All tools under the DELPH-IN collaboration are developed for general use of [[open-source license|open-source]] licensing.


Since 2005, DELPH-IN has held an annual Summit. This is a loosely structured [[unconference]] where people update each other about the work they are doing, seek feedback on current work, and occasionally hammer out agreement on standards and best practice.
Since 2005, DELPH-IN has held an annual summit. This is a loosely structured [[unconference]] where people update each other about the work they are doing, seek feedback on current work, and occasionally hammer out agreement on standards and best practice.


== DELPH-IN Technologies and Resources ==
== DELPH-IN technologies and resources ==
The DELPH-IN collaboration has been progressively building computational tools for [[deep linguistic processing|deep linguistic analysis]] such as the:
The DELPH-IN collaboration has been progressively building computational tools for [[deep linguistic processing|deep linguistic analysis]], such as:
* '''LKB system''' (Linguistic Knowledge Builder): a [[grammar engineering]] environment where linguists can build unification grammar with the [[Head-driven Phrase Structure Grammar]] formalism
* '''LKB system''' (Linguistic Knowledge Builder): a [[grammar engineering]] environment where linguists can build unification grammars with the [[Head-driven Phrase Structure Grammar]] formalism
* '''PET parser''' (Platform for Experimentation with efficient HPSG processing Techniques): an open source parser which produces [[HPSG]] parse trees with [[Minimal Recursion Semantics]] (MRS) outputs <ref>[http://pet.opendfki.de/wiki PET Parser website]</ref>
* '''PET parser''' (Platform for Experimentation with efficient HPSG processing Techniques): an open source parser which produces [[HPSG]] parse trees with [[Minimal Recursion Semantics]] (MRS) outputs <ref>{{Cite web |url=http://pet.opendfki.de/wiki |title=PET Parser website |access-date=2013-07-30 |archive-date=2022-03-29 |archive-url=https://web.archive.org/web/20220329234307/http://pet.opendfki.de/wiki |url-status=dead }}</ref>
* '''ACE processor''' (Answer Constraint Engine): an efficient system to process DELPH-IN grammars that provide [[HPSG]] syntactic parses with [[Minimal Recursion Semantics|MRS]] outputs. The latest version of ACE is able to [[natural language generation|generate natural language]] sentences.<ref>[http://sweaglesw.org/linguistics/ace/ ACE parser/generator homepage]</ref>
* '''ACE processor''' (Answer Constraint Engine): an efficient system to process DELPH-IN grammars that provide [[HPSG]] syntactic parses with [[Minimal Recursion Semantics|MRS]] outputs. The latest version of ACE is able to [[natural language generation|generate natural language]] sentences.<ref>[http://sweaglesw.org/linguistics/ace/ ACE parser/generator homepage]</ref>
* '''LOGON infrastructure''' is a collection of software and DELPH-IN grammars to provide [[transfer-based machine translation]]. The LOGON approach to machine translation has proven to provide quality oriented hybrid (rule-based and stochastic) translations.<ref>Stephan Oepen, Erik Velldal, Jan Tore Lønning, Paul Meurer, Victoria Rosén, and Dan Flickinger. 2007.[http://mt-archive.info/TMI-2007-Oepen.pdf Towards hybrid quality-oriented machine translation. On linguistics and probabilities in MT]. In Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation, pp.144–153. Skövde, Sweden.</ref>
* '''LOGON infrastructure''' is a collection of software and DELPH-IN grammars to provide [[transfer-based machine translation]]. The LOGON approach to machine translation has proven to provide quality oriented hybrid (rule-based and stochastic) translations.<ref>Stephan Oepen, Erik Velldal, Jan Tore Lønning, Paul Meurer, Victoria Rosén, and Dan Flickinger. 2007.[http://mt-archive.info/TMI-2007-Oepen.pdf Towards hybrid quality-oriented machine translation. On linguistics and probabilities in MT] {{Webarchive|url=https://web.archive.org/web/20200806142305/http://mt-archive.info/TMI-2007-Oepen.pdf |date=2020-08-06 }}. In Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation, pp.144–153. Skövde, Sweden.</ref>

<br>
Other than deep linguistic processing tools, the DELPH-IN collaboration supplies computational resources for [[Natural Language Processing]] such as computational HPSG grammars and language prototypes e.g.:
Other than deep linguistic processing tools, the DELPH-IN collaboration supplies computational resources for [[Natural Language Processing]] such as computational HPSG grammars and language prototypes e.g.:
* '''DELPH-IN grammars''': a catalogue of computational HPSG grammar hand-crafted to capture deep linguistics analysis specific to the respective languages <ref>[http://moin.delph-in.net/GrammarCatalogue DELPH-IN catalog of grammars]</ref>
* '''DELPH-IN grammars''': a catalogue of computational HPSG grammar hand-crafted to capture deep linguistics analysis specific to the respective languages <ref>[http://moin.delph-in.net/GrammarCatalogue DELPH-IN catalog of grammars]</ref>
* '''LinGO Grammar Matrix''': an open-source starter-kit for rapid prototyping of precision broad-coverage grammars compatible with the LKB. It contains a library of common language phenomena that computational grammarians can inherit for their HPSG grammars.<ref>Fokkens, Antske, Emily M. Bender and Varvara Gracheva. 2012. [http://moin.delph-in.net/MatrixDocTop|LinGO Grammar Matrix Customization System Documentation]. Online resource.</ref>
* '''LinGO Grammar Matrix''': an open-source starter-kit for rapid prototyping of precision broad-coverage grammars compatible with the LKB. It contains a library of common language phenomena that computational grammarians can inherit for their HPSG grammars.<ref>Fokkens, Antske, Emily M. Bender and Varvara Gracheva. 2012. [https://archive.today/20130730135014/http://moin.delph-in.net/MatrixDocTop LinGO Grammar Matrix Customization System Documentation]. Online resource.</ref>
* '''CLIMB libraries''' (Comparative Libraries of Implementations with Matrix Basis): an extended language library built on the Grammar Matrix. The objective of the CLIMB library is to maintain alternative analyses of the same phenomenon across different languages to test their impact on long-term grammar development.<ref>Fokkens, A., Avgustinova, T., and Zhang, Y. 2012. [http://www.coli.uni-saarland.de/~afokkens/materials/Fokkens-Avgustinova-Zhang-LREC2012.pdf Climb grammars: three projects using metagrammar engineering]. In Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC’12),Istanbul, Turkey.</ref>
* '''CLIMB libraries''' (Comparative Libraries of Implementations with Matrix Basis): an extended language library built on the Grammar Matrix. The objective of the CLIMB library is to maintain alternative analyses of the same phenomenon across different languages to test their impact on long-term grammar development.<ref>Fokkens, A., Avgustinova, T., and Zhang, Y. 2012. [http://www.coli.uni-saarland.de/~afokkens/materials/Fokkens-Avgustinova-Zhang-LREC2012.pdf Climb grammars: three projects using metagrammar engineering]. In Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC’12), Istanbul, Turkey.</ref>

<br>
Another range of DELPH-IN resources are not unlike the data use for [[shallow linguistic processing]], such as [[Text_corpus]] and [[treebanks]]:
Another range of DELPH-IN resources are not unlike the data use for [[shallow linguistic processing]], such as [[Text corpus]] and [[treebanks]]:
* '''MRS Test Suite''': a short but representative set of sentences designed to capture some [[minimal recursion semantics]] phenomena. The test suites are available in Bulgarian, English, French, German, Greek, Japanese, Mandarin, Norwegian, Portuguese, Russian and Spanish.<ref>[http://moin.delph-in.net/MatrixMrsTestSuite MRS Test Suite page]</ref>
* '''MRS Test Suite''': a short but representative set of sentences designed to capture some [[minimal recursion semantics]] phenomena. The test suites are available in Bulgarian, English, French, German, Greek, Japanese, Mandarin, Norwegian, Portuguese, Russian and Spanish.<ref>[http://moin.delph-in.net/MatrixMrsTestSuite MRS Test Suite page]</ref>
* '''Wikiwoods''': WikiWoods is a [[parsed corpus]] that provides rich syntacto-semantic annotations for the English Wikipedia.<ref>Dan Flickinger, Stephan Oepen, and Gisle Ytrestøl. 2010. [http://www.delph-in.net/wikiwoods/lrec10.pdf WikiWoods: Syntacto-semantic annotation for English Wikipedia]. In Proceedings of LREC-2010, pages 1665–1671.</ref>
* '''Wikiwoods''': WikiWoods is a [[parsed corpus]] that provides rich syntacto-semantic annotations for the English Wikipedia.<ref>Dan Flickinger, Stephan Oepen, and Gisle Ytrestøl. 2010. [http://www.delph-in.net/wikiwoods/lrec10.pdf WikiWoods: Syntacto-semantic annotation for English Wikipedia]. In Proceedings of LREC-2010, pages 1665–1671.</ref>
* '''DeepBank''': an ongoing project to annotate the one million words of 1989 Wall Street Journal text (the same set of sentences annotated in the original Penn Treebank project) with the English Resource Grammar, augmented with a robust approximating PCFG for complete coverage.<ref>Dan Flickinger, Valia Kordoni and Yi Zhang. 2012. [http://dfki.de/lt/publication_show.php?id=6619 DeepBank: A Dynamically Annotated Treebank of the Wall Street Journal]. In Proceedings of TLT-11, Lisbon, Portugal.</ref><ref>[http://moin.delph-in.net/DeepBank DeepBank homepage]</ref>
* '''DeepBank''': an ongoing project to annotate the one million words of 1989 Wall Street Journal text (the same set of sentences annotated in the original Penn Treebank project) with the English Resource Grammar, augmented with a robust approximating PCFG for complete coverage.<ref>Dan Flickinger, Valia Kordoni and Yi Zhang. 2012. [http://dfki.de/lt/publication_show.php?id=6619 DeepBank: A Dynamically Annotated Treebank of the Wall Street Journal] {{Webarchive|url=https://web.archive.org/web/20160304081735/http://dfki.de/lt/publication_show.php?id=6619 |date=2016-03-04 }}. In Proceedings of TLT-11, Lisbon, Portugal.</ref><ref>[http://moin.delph-in.net/DeepBank DeepBank homepage]</ref>
* '''Cathedral and the Bazaar''': a compilation of an early essay on Open Source by Eric Raymond with translations into multiple languages. It was proposed as a multilingual shared test suite to enable us to compare parses across different grammars.<ref>[http://moin.delph-in.net/MatrixMrsCatb DELPH-IN CatB page]</ref><ref>[http://catb.org/~esr/writings/cathedral-bazaar/ Official Cathedral and the Bazaar webpage]</ref>
* '''Cathedral and the Bazaar''': a compilation of an early essay on Open Source by Eric Raymond with translations into multiple languages. It was proposed as a multilingual shared test suite to enable us to compare parses across different grammars.<ref>[http://moin.delph-in.net/MatrixMrsCatb DELPH-IN CatB page]</ref><ref>[http://catb.org/~esr/writings/cathedral-bazaar/ Official Cathedral and the Bazaar webpage]</ref>

<br>
The open-source culture of the DELPH-IN collaboration provides the [[Natural Language Processing]] community with an array of [[deep linguistic processing]] tools and resources. However, the usability of DELPH-IN tools has been an issue with users and application developers new to the DELPH-IN ecology.{{Citation needed|date=July 2014}} The DELPH-IN developers are aware of these usability issues and there are ongoing attempts to improve documentation and tutorials of DELPH-IN technologies.<ref>[http://moin.delph-in.net/SaarlandUseability DELPH-IN 2013 Summit: Special Interest Group in Useability]</ref>
The open-source culture of the DELPH-IN collaboration provides the [[Natural Language Processing]] community with an array of [[deep linguistic processing]] tools and resources. However, the usability of DELPH-IN tools has been an issue with users and application developers new to the DELPH-IN ecology.{{Citation needed|date=July 2014}} The DELPH-IN developers are aware of these usability issues and there are ongoing attempts to improve documentation and tutorials of DELPH-IN technologies.<ref>[http://moin.delph-in.net/SaarlandUseability DELPH-IN 2013 Summit: Special Interest Group in Useability]</ref>


Line 54: Line 55:
* [[Head-driven Phrase Structure Grammar]]
* [[Head-driven Phrase Structure Grammar]]
* [[Minimal Recursion Semantics]]
* [[Minimal Recursion Semantics]]
* [[List of natural language processing toolkits]]


==References==
==References==

Latest revision as of 14:30, 6 June 2024

DELPH-IN
Academics
Discipline: Natural language processing
Formalisms: HPSG, MRS
DELPH-IN Summits
Inaugural: LisbonTop (2005)
Latest: Virtual2021Top (2021)
Upcoming: FairhavenTop (2022)

Deep Linguistic Processing with HPSG - INitiative (DELPH-IN) is a collaboration where computational linguists worldwide develop natural language processing tools for deep linguistic processing of human language.[1] The goal of DELPH-IN is to combine linguistic and statistical processing methods in order to computationally understand the meaning of texts and utterances.

The tools developed by DELPH-IN adopt two linguistic formalisms for deep linguistic analysis, viz. head-driven phrase structure grammar (HPSG) and minimal recursion semantics (MRS).[2] All tools under the DELPH-IN collaboration are released for general use under open-source licensing.

Since 2005, DELPH-IN has held an annual summit. This is a loosely structured unconference where people update each other about the work they are doing, seek feedback on current work, and occasionally hammer out agreement on standards and best practice.

DELPH-IN technologies and resources

The DELPH-IN collaboration has been progressively building computational tools for deep linguistic analysis, such as:

  • LKB system (Linguistic Knowledge Builder): a grammar engineering environment where linguists can build unification grammars with the Head-driven Phrase Structure Grammar formalism
  • PET parser (Platform for Experimentation with efficient HPSG processing Techniques): an open-source parser that produces HPSG parse trees with Minimal Recursion Semantics (MRS) output.[3]
  • ACE processor (Answer Constraint Engine): an efficient system for processing DELPH-IN grammars, providing HPSG syntactic parses with MRS output. The latest version of ACE is also able to generate natural language sentences.[4]
  • LOGON infrastructure: a collection of software and DELPH-IN grammars that provides transfer-based machine translation. The LOGON approach to machine translation has proven to provide quality-oriented hybrid (rule-based and stochastic) translations.[5]
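
The unification operation that underlies the grammars built with these tools can be illustrated with a minimal sketch. It assumes a much simplified setting: feature structures are untyped nested dictionaries with atomic string values, and there is no type hierarchy or structure sharing (reentrancy), both of which real HPSG grammars rely on. The function name `unify` and the feature names `HEAD`, `AGR`, `PER` and `NUM` are illustrative choices, not identifiers taken from the LKB, PET or ACE.

```python
from typing import Optional, Union

# A feature value is either an atomic string or a nested feature structure.
FeatureValue = Union[str, dict]


def unify(a: FeatureValue, b: FeatureValue) -> Optional[FeatureValue]:
    """Return the unification of a and b, or None if they conflict.

    Simplified sketch: untyped structures, no reentrancy, no type hierarchy.
    """
    # If either side is atomic, the two unify only when they are identical.
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if a == b else None

    # Start from a shallow copy of a so that neither input is modified.
    result = dict(a)
    for feature, b_value in b.items():
        if feature not in result:
            # Only b constrains this feature, so its value carries over.
            result[feature] = b_value
            continue
        merged = unify(result[feature], b_value)
        if merged is None:
            # A conflict anywhere makes the whole unification fail.
            return None
        result[feature] = merged
    return result


# Hypothetical lexical entry and two constraints imposed by a rule.
verb = {"HEAD": "verb", "AGR": {"PER": "3", "NUM": "sg"}}
singular = {"AGR": {"NUM": "sg"}}
plural = {"AGR": {"NUM": "pl"}}

print(unify(verb, singular))  # compatible: the information is merged
print(unify(verb, plural))    # incompatible NUM values: prints None
```

In this sketch a successful unification returns a new structure containing the information from both inputs, while a clash on any feature (here `NUM`) makes the whole operation fail, which is how a unification grammar rules out ill-formed combinations.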

Besides deep linguistic processing tools, the DELPH-IN collaboration supplies computational resources for natural language processing, such as computational HPSG grammars and language prototypes:

  • DELPH-IN grammars: a catalogue of computational HPSG grammars, hand-crafted to capture deep linguistic analyses specific to the respective languages.[6]
  • LinGO Grammar Matrix: an open-source starter-kit for rapid prototyping of precision broad-coverage grammars compatible with the LKB. It contains a library of common language phenomena that computational grammarians can inherit for their HPSG grammars.[7]
  • CLIMB libraries (Comparative Libraries of Implementations with Matrix Basis): an extended language library built on the Grammar Matrix. The objective of the CLIMB library is to maintain alternative analyses of the same phenomenon across different languages to test their impact on long-term grammar development.[8]

A further range of DELPH-IN resources resembles the data used for shallow linguistic processing, such as text corpora and treebanks:

  • MRS Test Suite: a short but representative set of sentences designed to capture some minimal recursion semantics phenomena. The test suites are available in Bulgarian, English, French, German, Greek, Japanese, Mandarin, Norwegian, Portuguese, Russian and Spanish.[9]
  • WikiWoods: a parsed corpus that provides rich syntacto-semantic annotations for the English Wikipedia.[10]
  • DeepBank: an ongoing project to annotate the one million words of 1989 Wall Street Journal text (the same set of sentences annotated in the original Penn Treebank project) with the English Resource Grammar, augmented with a robust approximating PCFG for complete coverage.[11][12]
  • Cathedral and the Bazaar: a compilation of an early essay on open source by Eric Raymond with translations into multiple languages. It was proposed as a multilingual shared test suite to enable comparison of parses across different grammars.[13][14]

The open-source culture of the DELPH-IN collaboration provides the natural language processing community with an array of deep linguistic processing tools and resources. However, the usability of DELPH-IN tools has been an issue for users and application developers new to the DELPH-IN ecosystem.[citation needed] The DELPH-IN developers are aware of these usability issues, and there are ongoing attempts to improve the documentation and tutorials for DELPH-IN technologies.[15]

See also

References

  1. ^ DELPH-IN: Open-Source Deep Processing
  2. ^ Ann Copestake, Dan Flickinger, Carl Pollard and Ivan A. Sag. 2005. Minimal Recursion Semantics: An Introduction Archived 2012-07-17 at the Wayback Machine. In Proceedings of Research on Language and Computation.
  3. ^ "PET Parser website". Archived from the original on 2022-03-29. Retrieved 2013-07-30.
  4. ^ ACE parser/generator homepage
  5. ^ Stephan Oepen, Erik Velldal, Jan Tore Lønning, Paul Meurer, Victoria Rosén, and Dan Flickinger. 2007. Towards hybrid quality-oriented machine translation. On linguistics and probabilities in MT Archived 2020-08-06 at the Wayback Machine. In Proceedings of the 10th International Conference on Theoretical and Methodological Issues in Machine Translation, pp. 144–153. Skövde, Sweden.
  6. ^ DELPH-IN catalog of grammars
  7. ^ Fokkens, Antske, Emily M. Bender and Varvara Gracheva. 2012. LinGO Grammar Matrix Customization System Documentation. Online resource.
  8. ^ Fokkens, A., Avgustinova, T., and Zhang, Y. 2012. Climb grammars: three projects using metagrammar engineering. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC’12), Istanbul, Turkey.
  9. ^ MRS Test Suite page
  10. ^ Dan Flickinger, Stephan Oepen, and Gisle Ytrestøl. 2010. WikiWoods: Syntacto-semantic annotation for English Wikipedia. In Proceedings of LREC-2010, pages 1665–1671.
  11. ^ Dan Flickinger, Valia Kordoni and Yi Zhang. 2012. DeepBank: A Dynamically Annotated Treebank of the Wall Street Journal Archived 2016-03-04 at the Wayback Machine. In Proceedings of TLT-11, Lisbon, Portugal.
  12. ^ DeepBank homepage
  13. ^ DELPH-IN CatB page
  14. ^ Official Cathedral and the Bazaar webpage
  15. ^ DELPH-IN 2013 Summit: Special Interest Group in Useability

External links