On-Demand Correlation Matrix Datasets

for Hidden Relationship Detection in Data and the Training of Artificial Intelligence (AI) Systems






Our Mission

"The Next Big Breakthrough in AI Will Be Around Language" - Harvard Business Review

While data might be the new oil, the dataset is the refined gasoline that powers every Machine Learning (ML) and AI operation.

We focus on context-controlled NLP/NLU (Natural Language Processing/Understanding) and feature engineering for hidden relationship detection in data. Our platform powers advanced approaches in Artificial Intelligence (AI) and Machine Learning (ML) using experimental and formal language models, including well-known models such as OpenAI's GPT-3 (2020), Google's BERT (2018) and word2vec (2013), as well as others based on vector space methods developed at Lawrence Berkeley National Laboratory (2008).

Our platform powers research groups, data vendors, funds and institutions by generating on-demand NLP/NLU correlation matrix datasets. We are particularly interested in how machines can trade information with one another, exchanging and transacting data in a way that minimizes a selected loss function. Our objective is to enable any group analyzing data to save time by testing hypotheses and running experiments at higher throughput, which can accelerate innovation, novel scientific breakthroughs and discoveries. For a little more on who we are, see our latest reddit AMA on r/AskScience (here)!


The Technology

It's Not Magic

At our system's core, as you can see in near real-time (here), we process information, such as peer-reviewed scientific literature, as it is published by the minute by the National Library of Medicine, along with other news and information sources. We then model this data with a combination, or ensemble, of the language models noted above, including OpenAI's GPT-3, Google's BERT and word2vec, together with vector space methods developed at Lawrence Berkeley National Laboratory. Vector space indices are built, relationships are calculated, and the resulting datasets are made available via an API.
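To make the last two steps concrete, here is a minimal sketch, assuming TF-IDF as a stand-in for the production model ensemble: it builds a vector space index over a few placeholder documents and computes the pairwise relationship (correlation) matrix that the API would serve.

```python
# Minimal sketch of the pipeline's final steps: build a vector space
# index over documents and compute a pairwise relationship matrix.
# TF-IDF stands in for the production ensemble (GPT-3, BERT, word2vec);
# the document texts are placeholders, not real NLM abstracts.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = [
    "CRISPR screens identify regulators of tumor immunity",
    "Checkpoint inhibitors reshape the tumor microenvironment",
    "Lithium-ion cathode degradation under fast charging",
]

vectors = TfidfVectorizer().fit_transform(docs)   # vector space index
matrix = cosine_similarity(vectors)               # N x N relationship scores

for i, row in enumerate(matrix):
    print(f"doc {i}:", [round(s, 2) for s in row])
```

In production the document vectors would come from the model ensemble rather than raw TF-IDF, but the index-then-correlate shape of the pipeline is the same.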

Backed by Patents

Our proprietary algorithms are based on variants proven in advanced Natural Language Processing/Understanding (NLP/NLU) and Machine Learning (ML) applications. These variants build on patents invented by our team, including "System and Method for Generating a Relationship Network" (K. Franks, C. A. Myers, R. M. Podowski, US Patent 7,987,191, 2011), developed in collaboration with Lawrence Berkeley National Laboratory.
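As a rough illustration of what "generating a relationship network" can mean in code (the patent's actual claims are not reproduced here, so the entity names, scores and cutoff below are hypothetical), pairwise relationship scores become weighted edges between entity nodes:

```python
# Hedged illustration only: entities become nodes, and pairwise scores
# above a cutoff become weighted, undirected edges. This mirrors the
# patent's title, not its claimed method.
pair_scores = {
    ("gene:TP53", "drug:nutlin-3"): 0.91,   # placeholder entities/scores
    ("gene:TP53", "gene:MDM2"): 0.88,
    ("drug:nutlin-3", "event:trial-halt"): 0.35,
}

CUTOFF = 0.5
network = {}
for (a, b), score in pair_scores.items():
    if score >= CUTOFF:
        network.setdefault(a, {})[b] = score   # adjacency list with weights
        network.setdefault(b, {})[a] = score

print(network)   # weak pairs (below the cutoff) never become edges
```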

Data Provenance, Lineage and Governance

Worried about where your data comes from? You shouldn't be. Data lineage and provenance matter: our Data Provenance Pipeline (DPP) hash controls data lineage for every dataset. Knowing where your data comes from, and how reliable it is, is critical in fields such as bioscience and finance, where institutions rely on robust data to make billion-dollar decisions every day.
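The DPP's internals aren't detailed here, so the following is only a sketch of one common approach to a provenance hash: canonicalize a snapshot's sources and parameters, then digest them, so that any change in lineage changes the hash. The field names are illustrative.

```python
# Illustrative sketch only: the DPP's internals are not described in this
# document. This shows one common way a provenance hash can bind a dataset
# snapshot to its sources and processing parameters.
import hashlib
import json

snapshot = {
    "sources": ["pubmed:2024-05-01T12:00Z"],          # where the data came from
    "model_ensemble": ["word2vec", "BERT"],           # how it was processed
    "parameters": {"context": "oncology", "top_n": 100},
}

# Canonical JSON -> SHA-256 digest; editing any field yields a new hash.
canonical = json.dumps(snapshot, sort_keys=True).encode("utf-8")
provenance_hash = hashlib.sha256(canonical).hexdigest()
print(provenance_hash)
```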

Algorithmically Generated Datasets

Our platform enables deep analysis of global trends in peer-reviewed scientific papers, patents, news, Internet data and research breakthroughs for the purpose of context-controlled NLP, NLU, NLI, NLG and sentiment analysis, enabling unique opportunities for continuous information arbitrage [PDF pg. 17].

VXV wallet-enabled API functionality

Our VXV wallet-enabled API keys allow any company to subscribe to our API services and stream NLP/NLU context-controlled datasets on demand, up to 1440 calls per day, for real-time analysis. Tiered subscription levels, each requiring a different amount of VXV, allow for specialized services and give advanced users the ability to “corner the market” on select, high-value datasets. VXV also enables two other crucial features: 1) optional data provenance and tracking services covering the parameters, calculations and original data source metrics executed for each dataset snapshot, and 2) real-time machine-to-machine transacting of feature vectors to minimize a loss function.
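As a hypothetical client sketch (the endpoint, header name and response shape below are assumptions, not the documented API), 1440 calls per day works out to one call per minute, which a subscriber might pace like this:

```python
# Hypothetical client sketch: URL, header name and response shape are
# assumptions, not the documented API. 1440 calls/day is one call per
# minute, so requests are paced accordingly.
import time
import requests

API_URL = "https://api.example.com/datasets"   # placeholder endpoint
API_KEY = "YOUR_VXV_WALLET_ENABLED_KEY"        # placeholder key

def stream_snapshots(n_calls: int):
    for _ in range(n_calls):
        resp = requests.get(API_URL, headers={"X-API-Key": API_KEY}, timeout=30)
        resp.raise_for_status()
        yield resp.json()                      # one dataset snapshot
        time.sleep(60)                         # stay within 1440 calls/day

for snapshot in stream_snapshots(3):
    print(snapshot)
```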

Vectorspace 'Smart Baskets'

Vectorspace context-controllable correlation matrix datasets can be used to create what we call 'Smart Baskets': groups of assets, such as equities or cryptocurrencies, that share known and hidden relationships with one another. Detecting hidden relationships between equities, entities and global events, based on sympathetic, symbiotic, parasitic or latent entanglement, can surface unique opportunities connected to 'information arbitrage'.
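Here is a toy sketch of how a 'Smart Basket' could be assembled from such a dataset, using made-up scores and a simple threshold rule; in practice the scores would come from a context-controlled correlation matrix.

```python
# Toy sketch of a 'Smart Basket': group assets whose pairwise relationship
# scores exceed a threshold. The scores below are invented for illustration.
import numpy as np

assets = ["BTC", "ETH", "LTC", "XYZ"]
scores = np.array([
    [1.00, 0.82, 0.75, 0.10],
    [0.82, 1.00, 0.70, 0.12],
    [0.75, 0.70, 1.00, 0.08],
    [0.10, 0.12, 0.08, 1.00],
])

THRESHOLD = 0.6
baskets, assigned = [], set()
for i, name in enumerate(assets):
    if i in assigned:
        continue
    # Everything strongly related to asset i joins its basket.
    members = [j for j in range(len(assets)) if scores[i, j] >= THRESHOLD]
    assigned.update(members)
    baskets.append([assets[j] for j in members])

print(baskets)   # [['BTC', 'ETH', 'LTC'], ['XYZ']]
```

A production version would add the hidden-relationship scores described above, but the grouping step itself stays this simple: a basket is just a connected set of assets in the correlation matrix.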