Dataset Builder: NLP-based context-controlled correlation matrices for unsupervised learning

  • Create unique NYSE, Nasdaq, cryptocurrency, ETF & OTC datasets
  • Engineer custom feature vectors
  • Explore trends & generate alpha


Data context: S&P 500 + General Data



Enter 1 to 5 concepts or keywords

example: Batteries, Bioengineering, Graphene, Blockchain, Machine Learning






What kind of things can be done with custom concept columns & features?

  • Create unique sectors or clusters based on concepts and hidden relationships and compare their gains to the S&P (see below)
  • Determine whether equities with correlated prices also have correlated concept or keyword scores
  • Examine symbiotic, parasitic and sympathetic relationships between equities
  • Automatically create baskets of stocks based on concepts and/or keywords
  • Detach the custom columns and append them to other proprietary in-house datasets
  • Select a Data Context (e.g. Biological, Chemical, GeoPhysical and others) to derive different signals
  • Use stock symbols as custom concept column labels and model cross-correlations between equities
  • Create features using trending terms anywhere on the internet
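As a rough illustration of the basket and concept-correlation ideas in the list above, here is a minimal Python sketch. It assumes the custom concept columns have been exported as a mapping from ticker to concept scores in the 0 to 1 range. The tickers (`AAA`, `BBB`, `CCC`), the score values, the function names and the use of cosine similarity are all hypothetical choices made for this example, not part of the product.

```python
from math import sqrt

# Hypothetical concept scores in [0, 1]. Tickers and values are invented
# for illustration and do not come from any real dataset.
concept_scores = {
    "AAA": {"Batteries": 0.9, "Graphene": 0.7, "Blockchain": 0.1},
    "BBB": {"Batteries": 0.8, "Graphene": 0.6, "Blockchain": 0.2},
    "CCC": {"Batteries": 0.1, "Graphene": 0.0, "Blockchain": 0.9},
}


def build_basket(scores, concept, threshold=0.5):
    """Return tickers whose score for `concept` is at least `threshold`,
    ordered from highest to lowest score (ties broken alphabetically)."""
    hits = [(ticker, row.get(concept, 0.0)) for ticker, row in scores.items()]
    hits = [(ticker, score) for ticker, score in hits if score >= threshold]
    return [ticker for ticker, _ in sorted(hits, key=lambda p: (-p[1], p[0]))]


def cosine_similarity(a, b):
    """Cosine similarity between two concept-score rows (dicts)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = sqrt(sum(a.get(k, 0.0) ** 2 for k in keys))
    norm_b = sqrt(sum(b.get(k, 0.0) ** 2 for k in keys))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a row with no concept exposure is similar to nothing
    return dot / (norm_a * norm_b)


def concept_similarity_matrix(scores):
    """Ticker-by-ticker matrix of concept similarity, as nested dicts."""
    tickers = sorted(scores)
    return {
        a: {b: cosine_similarity(scores[a], scores[b]) for b in tickers}
        for a in tickers
    }


basket = build_basket(concept_scores, "Batteries")
matrix = concept_similarity_matrix(concept_scores)
```

A matrix like this could then be compared entry by entry with a price-correlation matrix for the same tickers to see whether equities that move together also share concept exposure.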


    How do the concepts & trends correlate to crypto, stocks or ETFs?

    Scores range from 0 to 1 and represent the strength of known and hidden relationships between a concept and a stock, option or ETF. Each score is calculated by a series of algorithms that monitor data surrounding the company associated with the underlying security, and is then combined with scores from human curation teams. These concept scores can then be factored or parameterized to explore new signals or build new models.
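    The page does not say how the algorithmic and human-curated scores are combined. Purely as an assumption for illustration, the sketch below averages the algorithmic scores and blends the result with a curated score using a weight. The function name `combine_scores` and the parameter `curated_weight` are hypothetical.

```python
def combine_scores(algorithmic, curated, curated_weight=0.5):
    """Blend algorithmic scores with a human-curated score.

    ASSUMPTION: a weighted average is used here only as an example;
    the actual combination method is not described in the source.
    All inputs and the result lie in [0, 1].
    """
    if not algorithmic:
        raise ValueError("at least one algorithmic score is required")
    if not 0.0 <= curated_weight <= 1.0:
        raise ValueError("curated_weight must be between 0 and 1")
    for score in (*algorithmic, curated):
        if not 0.0 <= score <= 1.0:
            raise ValueError("scores must be between 0 and 1")

    algo_mean = sum(algorithmic) / len(algorithmic)
    return (1.0 - curated_weight) * algo_mean + curated_weight * curated


# Treating the weight as a tunable parameter is one way to "parameterize"
# a concept score when exploring signals.
blended = combine_scores([0.2, 0.4], curated=0.9, curated_weight=0.5)
```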



  • Data requests: info@vectorspace.ai
