TITLE:
DEBRA: On the Unsupervised Learning of Concept Hierarchies from (Literary) Text
AUTHORS:
Peter J. Worth, Domagoj Doresic
KEYWORDS:
Ontology Learning, Ontology Engineering, Concept Hierarchies, Concept Mapping, Concept Maps, Artificial Intelligence, Philosophy, Natural Language Processing, Knowledge Representation, Knowledge Representation and Reasoning, Machine Learning, NLP, Computer Science, Theoretical Computer Science, Epistemology, Metaphysics, Logic, Computing, Ontology, First Order Logic, Predicate Calculus
JOURNAL NAME:
International Journal of Intelligence Science, Vol.13 No.4, October 23, 2023
ABSTRACT: With this work, we introduce a novel method for the unsupervised learning of conceptual hierarchies, or concept maps as they are sometimes called, designed specifically for use with literary texts. This distinguishes it from the majority of the research literature on the topic, which focuses primarily on building ontologies from a vast array of structured and unstructured data sources to support various forms of AI, in particular the Semantic Web as envisioned by Tim Berners-Lee.
We first elaborate on the mutually informing disciplines of philosophy and computer science, or more specifically the relationship between metaphysics, epistemology, ontology, computing and AI. We then give a technically in-depth discussion of DEBRA, our dependency tree based concept hierarchy constructor, which, as its name suggests, constructs a conceptual map in the form of a directed graph. The graph illustrates the concepts, their respective relations, and the implied ontological structure of the concepts as encoded in the text, decoded with standard Python NLP libraries such as spaCy and NLTK. With this work we hope both to augment the Knowledge Representation literature with opportunities for intellectual advancement in AI, drawing on more intuitive, less analytical, and well-known forms of knowledge representation from the cognitive science community, and to open up new areas of research between Computer Science and the Humanities. The latter concerns the application of the latest NLP tools and techniques to literature of cultural significance, shedding light on existing methods of computation with respect to documents in semantic space that allow for, at the very least, the comparison of texts and the tracing of their evolution through time, using vector space math.
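
As a rough illustration of the general idea of turning a dependency parse into a directed concept graph, the following minimal Python sketch may help. It is not the DEBRA algorithm itself, whose details are not given in this abstract. The sentence, the hand-written parse in `TOKENS`, and the helper names `extract_triples` and `build_graph` are all assumptions made for the example. With spaCy, comparable tuples could presumably be read from each token's `text`, `dep_`, and `head.i` attributes.

```python
from collections import defaultdict

# Hypothetical, hand-written dependency parse of "Socrates teaches Plato".
# Each entry is (token text, dependency label, index of the head token).
# The root token points at itself, mirroring spaCy's convention.
TOKENS = [
    ("Socrates", "nsubj", 1),
    ("teaches", "ROOT", 1),
    ("Plato", "dobj", 1),
]

SUBJECT_LABELS = {"nsubj", "nsubjpass"}
OBJECT_LABELS = {"dobj", "attr"}


def extract_triples(tokens):
    """Pair every subject with every object that shares the same head token."""
    subjects = defaultdict(list)
    objects = defaultdict(list)
    for text, dep, head in tokens:
        if dep in SUBJECT_LABELS:
            subjects[head].append(text)
        elif dep in OBJECT_LABELS:
            objects[head].append(text)

    triples = []
    for head, subject_texts in subjects.items():
        relation = tokens[head][0]  # the head token's text labels the edge
        for subject in subject_texts:
            for obj in objects.get(head, []):
                triples.append((subject, relation, obj))
    return triples


def build_graph(triples):
    """Store the triples as a directed graph: concept -> [(relation, concept)]."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return dict(graph)


if __name__ == "__main__":
    for source, edges in build_graph(extract_triples(TOKENS)).items():
        for relation, target in edges:
            print(f"{source} --{relation}--> {target}")
```

Each subject becomes a node, and each labeled edge records the head word that links it to an object. A real system would also have to handle prepositional attachments, coreference, and nested clauses, none of which this sketch attempts.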