A new brain model of language
Semantic Folding Theory
and its application in Semantic Fingerprinting
A Cortical.io White Paper Version 1.0
Author: Francisco E. De Sousa Webber
Natural Language Understanding inspired by neuroscience
With Semantic Folding:
Words, sentences and whole texts can be compared to each other
NLP tasks like classification and semantic search are highly efficient
The system is trained in a fully unsupervised manner
No need for large language models or expensive computing resources
Taking the Hierarchical Temporal Memory (HTM) theory, a computational theory of the human cortex developed by Numenta, as a starting point, Cortical.io has developed Semantic Folding, a corresponding theory of language representation.
Semantic Folding describes a method of converting text into a semantically grounded representation called a semantic fingerprint. Semantic fingerprints are Sparse Distributed Representations (SDR) of words: large binary vectors that are very sparsely filled, with every bit representing distinct semantic information.
Many practical problems of statistical Natural Language Processing (NLP) systems and, more recently, of Transformer models – such as the need for large training data sets, the high cost of computation, the fundamental trade-off between precision and recall, and complex tuning procedures – can be elegantly overcome by applying Semantic Folding to text processing.
Semantic Folding Simply Explained:
Watch a Short Video
Semantic Folding converts text into semantic fingerprints, encapsulating meaning in a topographical representation.
Semantic fingerprints allow direct comparison of the meanings of any two pieces of text, showing thousands of semantic relations.
If two semantic fingerprints look similar, it means that the texts are semantically similar too.
With Semantic Folding, semantic spaces are stable across languages, enabling direct comparison of text across languages without machine translation.
How does Semantic Folding work?
To begin with, we select reference material that represents the domain the system will work in – Wikipedia for applications using general English, or domain-related collections of documents for industry-specific applications.
Then, the reference documents are cut into context-based snippets which are distributed over a 2D matrix, in such a way that snippets with similar topics (sharing many common words) are placed close to each other on the map. This process creates a 2D semantic map.
In the next step, a vector is created for each word contained in the reference documents, by activating the positions of all snippets containing this word. This produces a large, binary, very sparsely filled vector called a Semantic Fingerprint.
A Semantic Fingerprint is a vector of 16,384 bits (128×128) where every bit stands for a concrete context (topic) that can be represented as the bag of words of the training snippets at that position.
The whole Semantic Folding process is fully unsupervised.
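The steps above can be sketched in a few lines. This is a toy illustration, not the actual Cortical.io implementation: the snippet placements are made up, and the real semantic map is 128×128 rather than the 4×4 grid used here for readability.

```python
# Toy sketch of fingerprint creation from a 2D semantic map.
# All names and snippet placements below are illustrative assumptions.

MAP_SIDE = 4  # the real system uses 128

# Hypothetical training snippets already placed on the map, indexed by
# (row, col); topically similar snippets would sit close together.
snippets = {
    (0, 0): {"dog", "bark", "pet"},
    (0, 1): {"dog", "leash", "walk"},
    (2, 3): {"stock", "market", "price"},
    (3, 3): {"market", "trade", "price"},
}

def fingerprint(word):
    """Return the sparse binary fingerprint of a word: the set of
    flattened map positions of all snippets containing that word."""
    return {r * MAP_SIDE + c
            for (r, c), words in snippets.items() if word in words}

print(sorted(fingerprint("dog")))     # bits set at dog-related positions
print(sorted(fingerprint("market")))  # bits set at finance-related positions
```

A word's fingerprint is thus just the set of active bit positions; with 16,384 positions and only a few hundred snippets containing any given word, the resulting vector is very sparse.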
Applications of Semantic Folding
Semantic Folding builds the basis for high-level natural language processing functionalities that can be integrated in many different applications.
- Semantic fingerprints can be generated for language elements like words, sentences and entire documents.
- Any two pieces of text can be compared, regardless of length or language.
- Computational operations can be performed on the meaning of text data by measuring the overlap of semantic fingerprints.
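Measuring the overlap of two fingerprints can be sketched as a set operation on active bit indices. The bit values below are invented for illustration; the normalization shown (Jaccard-style) is one plausible choice, not necessarily the one Cortical.io uses.

```python
# Sketch: comparing two semantic fingerprints by bit overlap.
# Fingerprints are modeled as sets of active bit indices in a
# 16,384-bit (128x128) space; the values below are made up.

def overlap(fp_a, fp_b):
    """Number of shared active bits between two fingerprints."""
    return len(fp_a & fp_b)

def similarity(fp_a, fp_b):
    """Normalized overlap in [0, 1] (Jaccard-style)."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0

fp_dog  = {12, 40, 41, 97, 512}   # illustrative active bits
fp_wolf = {12, 40, 97, 300, 801}
fp_bond = {7000, 7001, 9000}

print(similarity(fp_dog, fp_wolf))  # high: many shared contexts
print(similarity(fp_dog, fp_bond))  # zero: no shared contexts
```

Because only the active bits need to be stored and intersected, these comparisons stay cheap even for very long texts.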
Semantic fingerprints work particularly well for NLP tasks like:
- Classification: instead of training the classifier with many labeled examples, one reference fingerprint can be used to describe a class.
- Semantic search: comparing the semantic overlap between the semantic fingerprint of a query in natural language and the fingerprints of the indexed documents proves to be both highly accurate and efficient.
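Both tasks reduce to the same overlap comparison. The sketch below assumes fingerprints are sets of active bit indices and hard-codes illustrative values; the class names, bit sets, and helper functions are all hypothetical.

```python
# Sketch: fingerprint-based classification and semantic search.
# All fingerprints below are invented for illustration.

def overlap(fp_a, fp_b):
    """Number of shared active bits between two fingerprints."""
    return len(fp_a & fp_b)

# One reference fingerprint per class, e.g. derived from a few seed terms
# rather than thousands of labeled examples.
classes = {
    "finance": {7000, 7001, 7002, 9000},
    "animals": {12, 40, 41, 97},
}

def classify(doc_fp):
    """Assign the class whose reference fingerprint overlaps most."""
    return max(classes, key=lambda c: overlap(classes[c], doc_fp))

def search(query_fp, index):
    """Rank indexed documents by fingerprint overlap with the query."""
    return sorted(index, key=lambda d: overlap(index[d], query_fp),
                  reverse=True)

doc_fp = {12, 40, 300}   # made-up document fingerprint
print(classify(doc_fp))  # prints "animals"
```

The same primitive serves both tasks: classification compares a document against a handful of class fingerprints, while search compares a query fingerprint against every indexed document.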
Advantages of Semantic Folding
High Accuracy
Semantic fingerprints leverage a rich set of 16,384 semantic features, enabling a fine-grained disambiguation of words and concepts.
High Efficiency
Semantic Folding requires an order of magnitude less training material (hundreds rather than thousands of examples) and fewer compute resources because it uses sparse distributed vectors.
High Transparency & Explainability
Each semantic feature can be inspected at the document level so that biases can be eliminated in the models and results explained.
High Flexibility & Scalability
Semantic Folding can be applied to any language and use case, and business users can easily customize models.