Text Embeddings
Introduction
The BOSS Modeling Framework (BMF) provides an easy-to-use capability for training, storing, and utilizing embeddings. Stored embeddings are known as assets within the BOSS ecosystem.
BOSS currently uses gensim's Word2Vec model for embedding generation. A Word2Vec model can be trained on an existing VDS, provided the VDS has an NLP operation in its operation stack. Creating such a VDS is described in the No Code Client User Guide. Viable VDS objects display an icon for starting an embedding training, which kicks off a Word2Vec training run.
Parameterizing the Embedding Training
After selecting a compatible VDS and clicking Create, a screen of training parameters will appear. More information about these parameters can be found in the gensim Word2Vec documentation.
Using Created Embeddings
Once training is complete, embeddings will appear on the Assets page in the No Code Client (see the user guide for more information). Visualizations (t-SNE / PCA) are provided to aid interpretation of the results.
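Both visualizations project the high-dimensional embedding vectors down to 2D for plotting. A minimal sketch of the PCA case, using scikit-learn and random vectors as a stand-in for trained embeddings (t-SNE works the same way via `sklearn.manifold.TSNE`):

```python
# Sketch: project embedding vectors to 2D for visualization.
# The random matrix stands in for 50 trained 100-dimensional word vectors.
import numpy as np
from sklearn.decomposition import PCA

vectors = np.random.rand(50, 100)
coords = PCA(n_components=2).fit_transform(vectors)  # shape (50, 2)
```

Each row of `coords` is one word's 2D position; nearby points correspond to words with similar embeddings.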
Trained assets can be used with text models when starting a training run on the BOSS platform; see the user guide for more information. Further instruction on using assets within model Python scripts can be found in the Data and Performance Analysis documentation.