What is word embedding in deep learning?

[Figure: A plot of word embeddings in English and German. Semantically equivalent words are co-located because their meaning has been inferred from context, and the relative semantics of words are consistent across languages. Source: Luong et al.]

The Big Idea of Word Embedding is to turn text into numbers.


Creating word embeddings is the first step in working with textual data, because computers and other devices don’t understand text directly: for detection and recognition tasks they work with numbers.

So a natural language modelling technique like word embedding is used to map each word or phrase in a vocabulary to a corresponding vector of real numbers. As well as being amenable to processing by learning algorithms, this vector representation has two important and advantageous properties:

  • Dimensionality Reduction: it is a far more compact representation than sparse encodings such as one-hot vectors (a short sketch of the contrast follows this list)
  • Contextual Similarity: it is a more expressive representation, because words used in similar contexts receive similar vectors
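
To make the efficiency point concrete, here is a minimal sketch contrasting a sparse one-hot vector with a dense embedding row. The vocabulary size, embedding dimension, and word index are arbitrary values chosen for this illustration, and the embedding matrix is randomly initialised rather than learned:

```python
import numpy as np

# Hypothetical sizes, chosen purely for illustration.
vocab_size = 10_000
embedding_dim = 100
word_index = 4321  # arbitrary placeholder ID for some word

# One-hot: a 10,000-dimensional sparse vector with a single 1.
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1.0

# Embedding: a dense 100-dimensional vector of real numbers,
# stored as one row of a (vocab_size x embedding_dim) matrix.
embedding_matrix = np.random.randn(vocab_size, embedding_dim) * 0.01
dense = embedding_matrix[word_index]

print(one_hot.shape)  # (10000,)
print(dense.shape)    # (100,)
```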

Fundamental Idea:

The main idea here is that every word can be converted to an N-dimensional vector of real numbers. Although every word gets assigned a unique vector (its embedding), similar words end up with vectors that lie close to each other. For example, the vectors for ‘woman’ and ‘girl’ would have a higher similarity than the vectors for ‘girl’ and ‘cat’.
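
The sketch below illustrates that intuition with tiny hand-made 3-dimensional vectors. Real embeddings are learned from data and typically have 50 to 300 or more dimensions; these numbers are invented purely so that ‘woman’ and ‘girl’ come out more similar than ‘girl’ and ‘cat’:

```python
import numpy as np

# Toy 3-dimensional "embeddings", invented for illustration only.
vectors = {
    "woman": np.array([0.90, 0.80, 0.10]),
    "girl":  np.array([0.85, 0.75, 0.20]),
    "cat":   np.array([0.10, 0.20, 0.90]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(vectors["woman"], vectors["girl"]))  # close to 1.0
print(cosine_similarity(vectors["girl"], vectors["cat"]))    # noticeably lower
```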

For these numerical representations to be really useful, they need to capture the meanings of words, the semantic relationships and similarities between them, and the contexts in which people naturally use them.

The meaning of a word can be captured, to some extent, by its use with other words. For example, ‘food’ and ‘hungry’ are more likely to be used in the same context than the words ‘hungry’ and ‘software’.

The idea is that if two words have similar meanings, they are likely to appear alongside similar context words, and this is used as the basis of the training algorithms for word embeddings.
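
As a rough sketch of where that training signal comes from, the function below extracts (target, context) pairs from a sentence using a simple whitespace tokenizer and a symmetric window of size 2. This is the kind of raw material that algorithms such as Word2Vec learn from; the sentence and window size are assumptions made for this example:

```python
def context_pairs(sentence, window=2):
    """Return (target, context) pairs within a symmetric window."""
    tokens = sentence.lower().split()
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

print(context_pairs("hungry people need food"))
# 'hungry' and 'food' share the context words 'people' and 'need'.
# Repeated over a large corpus, shared contexts like this pull the
# embeddings of related words closer together during training.
```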


Word Embedding techniques:

There are many techniques to create Word Embeddings. Some of the popular ones are listed below, and a short sketch contrasting two of them follows the list:

  1. Binary Encoding.
  2. TF Encoding.
  3. TF-IDF Encoding.
  4. Latent Semantic Analysis Encoding.
  5. Word2Vec Encoding.
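
The sketch below contrasts a count-based encoding (TF-IDF, technique 3) with a learned embedding (Word2Vec, technique 5). It assumes scikit-learn and gensim (version 4 or later) are installed; the three-sentence corpus is invented here and is far too small for the learned vectors to be meaningful, so treat it only as an API illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from gensim.models import Word2Vec

corpus = [
    "hungry people need food",
    "the woman and the girl ate food",
    "the cat is not hungry",
]

# TF-IDF: one sparse vector per document, one dimension per vocabulary word.
tfidf = TfidfVectorizer()
doc_vectors = tfidf.fit_transform(corpus)
print(doc_vectors.shape)          # (3 documents, vocabulary size)

# Word2Vec: one dense vector per word, learned from context windows.
sentences = [doc.split() for doc in corpus]
w2v = Word2Vec(sentences, vector_size=50, window=2, min_count=1, epochs=50)
print(w2v.wv["food"].shape)       # (50,)
print(w2v.wv.most_similar("food", topn=2))
```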

Applications:

Healthcare:

Take, for example, one of the biggest challenges faced by health-tech today: how to integrate HIMS (Hospital Information Management Systems) and EHRs (Electronic Health Records)? How to feed this integration into the CDS (Clinical Decision Support) of hospitals? And finally, how to automate the process of generating accurate results, both diagnostic and prescriptive, from CDS? Or consider another pressing problem faced by 21st-century healthcare management: how can feature selection over disease symptoms be used for epidemic surveillance (bird flu or H1N1, for example)? All of these tasks involve large volumes of free-text clinical notes and reports, which is exactly the kind of input that word embeddings make machine-readable.

Taxonomy:

Taxonomies are pivotal to knowledge management and organization and serve as the foundation for richer representations of knowledge, such as formal ontologies. Since building taxonomies by hand is cumbersome and expensive, taxonomy induction must be automated to work at scale, and that in turn requires recognizing words and word patterns in context.

Financial News:

The financial industry is highly sensitive to news announcements and press releases, and models are already being trained to understand the sentiment of financial news in order to detect and anticipate market movements.