Class Corpus

java.lang.Object
org.episteme.social.linguistics.loaders.tigerxml.Corpus
All Implemented Interfaces:
Serializable

public class Corpus extends Object implements Serializable
Represents a linguistic corpus in TIGER-XML format.

A corpus contains a sequence of annotated sentences, each with its own syntactic tree structure (composed of terminals and non-terminals). It also stores metadata and annotation specifications from the <head> section.

Since:
1.0
Author:
Silvere Martin-Michiellot, Gemini AI (Google DeepMind)
See Also:
Sentence
  • Constructor Details

    • Corpus

      public Corpus()
      Creates an empty Corpus.
    • Corpus

      public Corpus(String fileName)
      Loads a corpus from a TIGER-XML file.
      Parameters:
      fileName - the XML file path.
    • Corpus

      public Corpus(String fileName, int verbosity)
      Loads a corpus from a TIGER-XML file with specified verbosity.
      Parameters:
      fileName - the XML file path.
      verbosity - verbosity level (0-5).
    • Corpus

      public Corpus(Element root)
      Creates a corpus from a DOM root element.
      Parameters:
      root - the <corpus> element.
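
The following is a minimal sketch of how a root element for the Corpus(Element) constructor might be obtained. It assumes that Element refers to org.w3c.dom.Element, which this page does not confirm. The TigerRootDemo class, its SAMPLE document, and the parseRoot helper are hypothetical names introduced only for illustration. The sketch uses only the JDK's DOM parser, so it runs without the Corpus class on the classpath; the actual constructor call appears as a comment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper class, not part of the documented API.
final class TigerRootDemo {

    // A tiny hand-written TIGER-XML document: one sentence with one
    // terminal and one non-terminal node. The contents are made up.
    static final String SAMPLE =
          "<corpus id=\"demo\">"
        + "<head><meta><name>demo</name></meta></head>"
        + "<body>"
        + "<s id=\"s1\">"
        + "<graph root=\"s1_500\">"
        + "<terminals><t id=\"s1_1\" word=\"Hello\" pos=\"UH\"/></terminals>"
        + "<nonterminals>"
        + "<nt id=\"s1_500\" cat=\"S\"><edge label=\"--\" idref=\"s1_1\"/></nt>"
        + "</nonterminals>"
        + "</graph>"
        + "</s>"
        + "</body>"
        + "</corpus>";

    // Parses an XML string with the JDK DOM parser and returns its root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement();
    }

    public static void main(String[] args) throws Exception {
        Element root = parseRoot(SAMPLE);
        System.out.println("root element: " + root.getTagName());
        System.out.println("sentences: "
            + root.getElementsByTagName("s").getLength());

        // Assumption: with the library on the classpath, the root could be
        // handed to the constructor documented above:
        // Corpus corpus = new Corpus(root);
    }
}
```

When the corpus is stored on disk, the Corpus(String fileName) constructor documented above would presumably make the manual DOM parsing step unnecessary.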
  • Method Details