FOR IMMEDIATE RELEASE
Oct 2, 2018
Dataset Engineering & API Partnership Launched by Vectorspace AI & Pushshift.io, Reddit Data Provider
VALLETTA, MALTA / October 02, 2018 / Pushshift.io, a reddit.com API (Application Programming Interface) data provider and social media data engineering company, and Vectorspace AI, a wallet-enabled API provider and Natural Language Processing (NLP) company, are partnering to jointly develop advanced context-controllable NLP systems and a platform that will analyze human language surrounding global trends, scientific topics, concepts and entities such as public companies and cryptocurrencies. Initially, the platform will be used to extend the automated feature engineering services offered by both companies, based on data, metadata and triangulated data from social media companies such as reddit.com. Additionally, alternative datasets generated for the life sciences and global financial markets will be available for licensing via a VXV wallet-enabled, API-based dataset services business unit. Details and use cases related to the platform and resulting datasets are described in a recent reddit AMA (Ask Me Anything) and include:
- Unsupervised learning connected to clustering of entities, equities or cryptocurrencies that have known and hidden relationships to topics, concepts, named entities, labels, symbols or global trends in research and news. These clusters can be treated as baskets, specialized ETFs, indices or sectors, short or long.
- Augmenting current clustering methods with new scored correlations connected to topics, concepts, named entities or keywords in scientific research, news or other knowledge domains such as drug development, biology, geophysics, chemistry, technology, commercial products or others.
- Detection of symbiotic, parasitic and sympathetic relationships between entities.
- Additional signal generation using time-series context-controlled sentiment scores where rows can be public companies or cryptocurrencies and columns can be labeled with custom features such as global trends in research, news, topics or concepts.
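As a rough illustration of the dataset shape described in the use cases above, the following minimal Python sketch builds a small score matrix (rows are entities, columns are contexts), derives a correlation matrix from it, and groups correlated entities into baskets. The ticker symbols, context labels, score values, the 0.8 threshold and the choice of Pearson correlation are all assumptions made for illustration; they are not taken from either company's products or methods.

```python
from math import sqrt

# Hypothetical context-controlled sentiment scores (illustrative values only).
# Rows are entities (made-up tickers); columns follow the order in CONTEXTS.
CONTEXTS = ["gene editing", "battery storage", "privacy"]
SCORES = {
    "AAA": [0.9, 0.1, 0.2],
    "BBB": [0.8, 0.2, 0.1],
    "CCC": [0.1, 0.9, 0.3],
    "DDD": [0.2, 0.8, 0.4],
}


def pearson(x, y):
    """Pearson correlation of two equal-length score rows (0.0 if a row is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return cov / (norm_x * norm_y)


def correlation_matrix(scores):
    """Entity-by-entity correlation matrix as a nested dict."""
    names = sorted(scores)
    return {a: {b: pearson(scores[a], scores[b]) for b in names} for a in names}


def baskets(scores, threshold=0.8):
    """Group entities into baskets: connected components of the graph that
    links two entities when their correlation is at least `threshold`."""
    corr = correlation_matrix(scores)
    remaining = set(scores)
    groups = []
    for name in sorted(scores):
        if name not in remaining:
            continue
        remaining.discard(name)
        stack, group = [name], []
        while stack:
            current = stack.pop()
            group.append(current)
            linked = [o for o in sorted(remaining) if corr[current][o] >= threshold]
            for other in linked:
                remaining.discard(other)
                stack.append(other)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    for group in baskets(SCORES):
        print(group)
```

In this toy data the first two entities score highest on the same context and the last two on another, so thresholding the correlation matrix separates them into two baskets. A production system would likely use a dedicated clustering method and time-series inputs rather than a single snapshot.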
Pushshift and Vectorspace AI will also jointly augment Pushshift.io's subreddit recommendation and discovery system with advanced NLP features modeled in vector space.
The platform will also be used to augment the product offerings of Pushshift.io and Vectorspace AI, including Pushshift.io API offerings and Vectorspace AI VXV wallet-enabled APIs, which are currently being used by partners and collaborators including Lawrence Berkeley National Laboratory (LBNL), Google's BigQuery and Hangouts groups, and Harvard's Berkman Klein Center.
About Pushshift.io (pushshift.io)
Pushshift.io receives 2-5 million API calls per day connected to data from social media sites such as reddit.com. Pushshift.io APIs and data sources have been key in enabling a variety of published research papers from institutions such as Stanford, the MIT Media Lab, Harvard and Princeton. Selected citations include "Community Interaction and Conflict on the Web", "Helping Crisis Responders Find the Informative Needle in the Tweet Haystack", "The Impact of Crowds on News Engagement: A Reddit Case Study" and "Pew Research: Seven-in-ten Reddit Users Get News on the Site".
About Vectorspace AI (vectorspace.ai)
Vectorspace AI offers Natural Language Processing services and context-controlled alternative datasets consisting of correlation matrices, context-controlled sentiment scoring and other automatically engineered feature attributes. These services are free for academic purposes and are also available using the VXV token and the "VXV wallet-enabled API" located at vectorspace.ai/recommend/datasets
The Vectorspace AI team is a spin-off from Lawrence Berkeley National Laboratory (LBNL) in the San Francisco Bay Area, California, US, with headquarters in Malta. The team holds patents in the area of hidden relationship detection (US7987191B2).