By Charles Henry
Part 3 of a 3-part series
In two recent blogs I noted the predicted imminent appearance of neuromorphic processors: new machines with the characteristics of biological computing, their wiring mimicking brain synapses. These processors respond to data based on accumulated past experience: the “weight,” or strength, of an association changes the value of the corresponding connection. The neural-type network is thus reset according to the new weighted values, and subsequent computation is adjusted to accommodate that experience. The blogs explored the implications of these new processors, particularly the suggestive dissolving of the boundary between machines and our own minds: the potential to augment, refine, and refocus the way we think by using them.
In researching the topic, I came across an article in PLOS by Ruggero Gramatica and colleagues. The article describes the results of analyzing a very large text dataset of medical literature to reveal mechanisms of action (MOA) of certain drugs that would not otherwise be apparent to a human reader because of the vast scale of the texts in question. MOA is a term in pharmacology “that refers to the specific biochemical interaction through which a drug substance produces its pharmacological effect. A mechanism of action usually includes mention of the specific molecular targets to which the drug binds, such as an enzyme or receptor.” Understanding the MOA provides insight into the molecular interactions of the drug and the physiological paths it travels to its target. In this instance, revealing otherwise hidden mechanisms of action not only leads to a more precise mapping of the drug’s reception but also provides evidence for applications of the medicine that were not previously known: a remedy or cure not envisioned in the drug’s development. In this way, new uses of a medicine can be discovered not through clinical trials but by mining the medical literature.
This computational approach, developed by Gramatica and colleagues, can be characterized as an inference engine: literature-based research structured by the distributional hypothesis of linguistic theory, which relates the statistical properties of word association to the intrinsic meaning of a concept, combined with network theory, a collection of versatile mathematical tools for representing interrelated concepts and analyzing their connection structure.
As described in the journal article, the main objective of this approach “is to provide a methodology for creating network knowledge representations, capturing the essential entities occurring in a variety of publications and connecting them into a graph whenever they co-occur in a given sentence.” To date the approach has been used almost exclusively in the sciences, particularly medicine, but because it is text based and can search, analyze, and visualize literally millions of articles, the technology is extensible to any text-based corpus. The authors note that this is in part a method “to rank the relevance of the inferences, introducing a measure based on a stochastic process (random walk) defined on the graph: this measure takes into account all paths connecting two concepts and uses the abundance and redundancy of these paths, together with their weights, as a measure of the strength of the overall relation between the concepts.”
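The graph-construction step the authors describe can be sketched in a few lines. This is an illustrative simplification, not the authors' actual pipeline: it assumes concepts have already been extracted from each sentence, and the concept names and corpus below are invented for demonstration.

```python
from collections import Counter
from itertools import combinations

def build_cooccurrence_graph(sentences):
    """Each concept becomes a node; two concepts are connected whenever
    they co-occur in a sentence, and the edge weight counts how often."""
    edges = Counter()
    for concepts in sentences:
        # A sentence is modeled as the set of concepts it mentions.
        for pair in combinations(sorted(set(concepts)), 2):
            edges[pair] += 1
    return edges

# Hypothetical corpus: each "sentence" reduced to its extracted concepts.
corpus = [
    ["aspirin", "COX-1", "inflammation"],
    ["aspirin", "COX-1"],
    ["COX-1", "prostaglandin"],
    ["prostaglandin", "inflammation"],
]

graph = build_cooccurrence_graph(corpus)
# "COX-1" and "aspirin" co-occur in two sentences, so that edge's weight is 2.
```

At the scale the authors describe, the same loop runs over millions of articles, so the resulting network encodes associations no single reader could hold in mind.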
The distributional hypothesis posits a “correlation between distributional similarity and meaning similarity, which allows exploiting the former in order to derive the latter,” write the authors. “This hypothesis suggests the assumption that concepts occurring in the same unit of text are in some way semantically related…hopping through this knowledge network and drawing a path between any two non-adjacent concepts can be interpreted as suggesting a possible ‘sentence’ that has never actually been uttered but that can implicitly carry a new and correct idea.”
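The path-ranking idea in these passages can be sketched with a simple random walk. Again, this is a minimal illustration rather than the authors' actual measure, and the concept names and edge weights are invented:

```python
from collections import defaultdict

# Hypothetical co-occurrence edge weights between extracted concepts.
edges = {
    ("aspirin", "COX-1"): 2,
    ("COX-1", "inflammation"): 1,
    ("aspirin", "inflammation"): 1,
    ("COX-1", "prostaglandin"): 1,
    ("inflammation", "prostaglandin"): 1,
}

def transition_matrix(edges):
    """Row-normalize symmetric edge weights into random-walk probabilities."""
    adj = defaultdict(lambda: defaultdict(float))
    for (a, b), w in edges.items():
        adj[a][b] += w
        adj[b][a] += w
    return {n: {m: w / sum(nbrs.values()) for m, w in nbrs.items()}
            for n, nbrs in adj.items()}

def path_strength(probs, src, dst, max_len=3):
    """Sum the probability of reaching dst from src over all walks of
    length <= max_len: abundant, heavily weighted paths yield a strong
    relation even when src and dst never co-occur directly."""
    strength, frontier = 0.0, {src: 1.0}
    for _ in range(max_len):
        step = defaultdict(float)
        for node, p in frontier.items():
            for nbr, q in probs.get(node, {}).items():
                step[nbr] += p * q
        strength += step[dst]
        frontier = step
    return strength

probs = transition_matrix(edges)
# aspirin and prostaglandin never co-occur, yet indirect paths link them.
strength = path_strength(probs, "aspirin", "prostaglandin")
```

The nonzero strength between “aspirin” and “prostaglandin” plays the role of a new “sentence”: a relation never stated in any single sentence of the toy corpus, but implied by the abundance of paths connecting the two concepts.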
In the previous blogs, discovery by inference, and the freer association of terms and concepts it represents, was contrasted with libraries’ traditional propensity to demarcate information more starkly by category and theme. We might explore the utility of this new approach through a prototype: build a text corpus that includes course syllabi and related pedagogical materials, the library catalog, articles written by local faculty, and other instances of academic expression, and see what can be inferred in the way of influence. The goal would be to trace an intellectual mechanism of action based on the “new sentences” of semantically associated words, compiling a matrix of weighted strengths from which new conceptual relationships can be inferred: sentences that were never written but that nonetheless contain accurate correlations of ideas.
Imagined another way, it would be as if a student, entering a stone-and-mortar library, could ask questions and construct inquiries that caused the stone arches and walls to reassemble into a new edifice, reorganizing the information in response to the inquiry. All the original parts are kept but redesigned, making manifest knowledge that was concealed by the architecture of the original (in the example above, the format and scope of hundreds of thousands of articles). While this is a fantasy in the three-dimensional world, it is an essential characteristic of the digital environment, and one that starkly differentiates the analog from the virtual.
It reminds me of the introduction to Newton’s Principia, where he draws the inherited distinction between geometry and mechanics: “mechanics is so distinguished from geometry, that what is perfectly accurate is called geometry; what is less so is called mechanical.” Newton argues that the distinction is counterproductive, and that “geometry is founded in mechanical practice, and is nothing but that part of universal mechanics which accurately proposes and demonstrates the art of measuring.” The methodology of the Principia brought the two separate ways of seeing together (the marrying of mathematics and what was then called natural philosophy), animating the once immovable, Platonic type of geometrical, ideal phenomena. We can never see the world, or ourselves, the same way again.
Isaac Newton, The Principia (Philosophiæ Naturalis Principia Mathematica), trans. Andrew Motte (Prometheus Books, 1995).