We’re the Vectorspace AI team, and we’re here to talk about datasets based on human language and how they can contribute to scientific discovery.
What do we do?
In general terms, we add structure to unstructured data for unsupervised Machine Learning (ML) systems. It's not very glamorous or even interesting to many, but you might liken it to the glue that binds data and semi-intelligent systems.
More specifically, we build datasets and augment existing datasets with additional 'signal' to help minimize a loss function. We do this by generating context-controlled correlation matrices. The correlation scores are derived from machine and human language processed in vector space via labeled embeddings (LBNL 2005, Google 2010).
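To make the idea of a correlation matrix built from labeled embeddings a bit more concrete, here is a minimal sketch. It is not our production pipeline. The toy vectors, the `EMBEDDINGS` dictionary, the function names, and the use of cosine similarity as the correlation score are all assumptions made for illustration. The `context` argument (per-dimension weights) is just one hypothetical way to picture "context control".

```python
import math

# Hypothetical toy embeddings (label -> vector). Real vectors would come
# from a model trained on text; these numbers are invented for illustration.
EMBEDDINGS = {
    "aspirin": [0.9, 0.1, 0.3],
    "ibuprofen": [0.8, 0.2, 0.4],
    "bitcoin": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # avoid dividing by zero for all-zero vectors
    return dot / (norm_u * norm_v)


def correlation_matrix(embeddings, context=None):
    """Return (labels, matrix) of pairwise cosine scores.

    `context` is an optional list of per-dimension weights. This is an
    assumed stand-in for context control: it re-weights each dimension
    before the scores are computed.
    """
    labels = sorted(embeddings)
    dim = len(embeddings[labels[0]])
    weights = context if context is not None else [1.0] * dim
    weighted = {
        label: [w * x for w, x in zip(weights, embeddings[label])]
        for label in labels
    }
    matrix = [
        [cosine(weighted[a], weighted[b]) for b in labels]
        for a in labels
    ]
    return labels, matrix


if __name__ == "__main__":
    labels, matrix = correlation_matrix(EMBEDDINGS)
    for label, row in zip(labels, matrix):
        print(label, [round(score, 3) for score in row])
```

Passing something like `context=[1.0, 0.0, 1.0]` ignores the second dimension, so the same labels get different scores depending on which "context" is emphasized.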
More: https://vectorspace.ai/reddit-ama.html