Lund University Publications

Random indexing of multidimensional data

Sandin, Fredrik ; Emruli, Blerim LU and Sahlgren, Magnus (2017) In Knowledge and Information Systems 52. p.267-290
Abstract
Random indexing (RI) is a lightweight dimension reduction method, which is used, for example, to approximate vector semantic relationships in online natural language processing systems. Here we generalise RI to multidimensional arrays and therefore enable approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which is the theoretical basis also for ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments which demonstrate that a multidimensional generalisation of RI is feasible, including comparisons with ordinary RI and principal component analysis. The RI method is well suited for online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided.
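The incremental update and on-the-fly inner product mentioned in the abstract can be illustrated with a minimal sketch of ordinary (one-way) RI. This is not the authors' open source implementation and does not cover the multidimensional generalisation. The names `make_index_vector` and `RandomIndexer`, and the parameters `dim`, `nnz` and `seed`, are hypothetical choices for illustration. The sketch assumes sparse ternary index vectors with an equal number of +1 and -1 entries.

```python
import numpy as np


def make_index_vector(rng, dim, nnz):
    """Sparse ternary vector: nnz/2 entries are +1, nnz/2 are -1, the rest 0."""
    vec = np.zeros(dim)
    positions = rng.choice(dim, size=nnz, replace=False)  # distinct positions
    vec[positions[: nnz // 2]] = 1.0
    vec[positions[nnz // 2:]] = -1.0
    return vec


class RandomIndexer:
    """Illustrative one-way random indexing (hypothetical helper class)."""

    def __init__(self, dim=2000, nnz=8, seed=0):
        if nnz % 2 != 0 or nnz > dim:
            raise ValueError("nnz must be even and no larger than dim")
        self.dim, self.nnz = dim, nnz
        self.rng = np.random.default_rng(seed)
        self.index = {}  # feature -> fixed random index vector
        self.state = {}  # item -> accumulated fixed-size state vector

    def _index_of(self, feature):
        # Index vectors are created lazily, so new features can appear in a stream.
        if feature not in self.index:
            self.index[feature] = make_index_vector(self.rng, self.dim, self.nnz)
        return self.index[feature]

    def update(self, item, feature, weight=1.0):
        """Add `weight` to the (item, feature) relationship incrementally."""
        vec = self.state.setdefault(item, np.zeros(self.dim))
        vec += weight * self._index_of(feature)  # in-place; size never grows

    def weight(self, item, feature):
        """Approximate the accumulated (item, feature) weight."""
        vec = self.state.get(item, np.zeros(self.dim))
        return float(vec @ self._index_of(feature)) / self.nnz

    def inner(self, item_a, item_b):
        """Approximate the inner product of two items' weight vectors."""
        return float(self.state[item_a] @ self.state[item_b]) / self.nnz


if __name__ == "__main__":
    ri = RandomIndexer()
    for item, feature, w in [("cat", "meow", 2.0), ("cat", "purr", 3.0),
                             ("dog", "meow", 1.0), ("dog", "bark", 4.0)]:
        ri.update(item, feature, w)
    print(ri.weight("cat", "purr"), ri.inner("cat", "dog"))
```

Dividing by `nnz` normalises by the squared norm of an index vector. The estimates are exact only when the index vectors involved do not overlap; otherwise they carry noise from chance overlaps, which becomes less likely as `dim` grows relative to `nnz`.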
author
Sandin, Fredrik ; Emruli, Blerim and Sahlgren, Magnus
publishing date
2017
type
Contribution to journal
publication status
published
subject
in
Knowledge and Information Systems
volume
52
pages
267 - 290
external identifiers
  • scopus:85001755138
ISSN
0219-3116
DOI
10.1007/s10115-016-1012-2
language
English
LU publication?
no
id
a7ede0c1-5674-4115-b368-27564a26b128
date added to LUP
2025-03-31 21:26:59
date last changed
2025-04-04 14:07:45
@article{a7ede0c1-5674-4115-b368-27564a26b128,
  abstract     = {{Random indexing (RI) is a lightweight dimension reduction method, which is used, for example, to approximate vector semantic relationships in online natural language processing systems. Here we generalise RI to multidimensional arrays and therefore enable approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which is the theoretical basis also for ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments which demonstrate that a multidimensional generalisation of RI is feasible, including comparisons with ordinary RI and principal component analysis. The RI method is well suited for online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided.}},
  author       = {{Sandin, Fredrik and Emruli, Blerim and Sahlgren, Magnus}},
  issn         = {{0219-3116}},
  language     = {{eng}},
  pages        = {{267--290}},
  journal      = {{Knowledge and Information Systems}},
  title        = {{Random indexing of multidimensional data}},
  url          = {{http://dx.doi.org/10.1007/s10115-016-1012-2}},
  doi          = {{10.1007/s10115-016-1012-2}},
  volume       = {{52}},
  year         = {{2017}},
}