Global Vectors for Word Representation — embedding

The GloVe pre-trained word vectors provide word embeddings created using varying numbers of tokens.

Usage

embedding_glove6b(
  dir = NULL,
  dimensions = c(50, 100, 200, 300),
  delete = FALSE,
  return_path = FALSE,
  clean = FALSE,
  manual_download = FALSE
)

embedding_glove27b(
  dir = NULL,
  dimensions = c(25, 50, 100, 200),
  delete = FALSE,
  return_path = FALSE,
  clean = FALSE,
  manual_download = FALSE
)

embedding_glove42b(
  dir = NULL,
  delete = FALSE,
  return_path = FALSE,
  clean = FALSE,
  manual_download = FALSE
)

embedding_glove840b(
  dir = NULL,
  delete = FALSE,
  return_path = FALSE,
  clean = FALSE,
  manual_download = FALSE
)

Source

https://nlp.stanford.edu/projects/glove/

Arguments

dir: Character, path to directory where data will be stored. If NULL, user_cache_dir will be used to determine path.
dimensions: A number indicating the number of vectors to include. One of 50, 100, 200, or 300 for glove6b, or one of 25, 50, 100, or 200 for glove27b.
delete: Logical, set TRUE to delete dataset.
return_path: Logical, set TRUE to return the path of the dataset.
clean: Logical, set TRUE to remove intermediate files. This can greatly reduce the size. Defaults to FALSE.
manual_download: Logical, set TRUE if you have manually downloaded the file and placed it in the folder designated by running this function with return_path = TRUE.

Value

A tibble with 400k, 1.9m, 2.2m, or 1.2m rows (one row for each unique token in the vocabulary) and the following variables:

token: An individual token (usually a word)
d1, d2, etc: The embeddings for that token.

Details

Citation info:

InProceedings{pennington2014glove,
author = {Jeffrey Pennington and Richard Socher and Christopher D.
Manning},
title = {GloVe: Global Vectors for Word Representation},
booktitle = {Empirical Methods in Natural Language Processing (EMNLP)},
year = 2014
pages = {1532-1543}
url = {http://www.aclweb.org/anthology/D14-1162}
}

References

Jeffrey Pennington, Richard Socher, and Christopher D. Manning. 2014. GloVe: Global Vectors for Word Representation.

Examples

if (FALSE) {
embedding_glove6b(dimensions = 50)

# Custom directory
embedding_glove42b(dir = "data/")

# Deleting dataset
embedding_glove6b(delete = TRUE, dimensions = 300)

# Returning filepath of data
embedding_glove840b(return_path = TRUE)
}