Random indexing of multidimensional data

Sandin, Fredrik; Emruli, Blerim; Sahlgren, Magnus

Sandin, Fredrik; Emruli, Blerim and Sahlgren, Magnus (2017). In Knowledge and Information Systems 52, p. 267-290

Abstract: Random indexing (RI) is a lightweight dimension reduction method, which is used, for example, to approximate vector semantic relationships in online natural language processing systems. Here we generalise RI to multidimensional arrays and therefore enable approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which is the theoretical basis also for ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments which demonstrate that a multidimensional generalisation of RI is feasible, including comparisons with ordinary RI and principal component analysis. The RI method is well suited for online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided.
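The incremental update and on-the-fly decoding described in the abstract can be illustrated with a small sketch. This is not the authors' open source implementation; it is a minimal two-way example written for this record, assuming sparse ternary index vectors (a few entries set to +1 or -1) and a fixed-size state array. The class name `TwoWayRI`, its methods, and the default sizes are hypothetical choices.

```python
import numpy as np


def index_vector(rng, dim, nonzeros):
    """Return a sparse ternary index vector as (positions, signs)."""
    positions = rng.choice(dim, size=nonzeros, replace=False)
    signs = rng.choice([-1.0, 1.0], size=nonzeros)
    return positions, signs


class TwoWayRI:
    """Hypothetical sketch: weights w[i, j] stored in a fixed-size state array."""

    def __init__(self, dim=(200, 200), nonzeros=8, seed=0):
        self.rng = np.random.default_rng(seed)
        self.dim = dim
        self.k = nonzeros
        self.state = np.zeros(dim)   # size never grows with the data
        self.rows, self.cols = {}, {}

    def _lookup(self, table, key, dim):
        # Assign a random index vector the first time a key is seen.
        if key not in table:
            table[key] = index_vector(self.rng, dim, self.k)
        return table[key]

    def add(self, i, j, weight=1.0):
        """Incrementally add `weight` to the relationship (i, j)."""
        (pi, si) = self._lookup(self.rows, i, self.dim[0])
        (pj, sj) = self._lookup(self.cols, j, self.dim[1])
        # Only k * k entries of the state are touched per update.
        self.state[np.ix_(pi, pj)] += weight * np.outer(si, sj)

    def decode(self, i, j):
        """Approximate w[i, j] by an inner product with the index vectors."""
        (pi, si) = self._lookup(self.rows, i, self.dim[0])
        (pj, sj) = self._lookup(self.cols, j, self.dim[1])
        block = self.state[np.ix_(pi, pj)]
        return float(np.sum(block * np.outer(si, sj))) / (self.k ** 2)


ri = TwoWayRI()
ri.add("word_a", "context_x", 3.0)
ri.add("word_a", "context_x", 2.0)   # incremental update of the same pair
estimate = ri.decode("word_a", "context_x")
```

With several distinct pairs stored, decoded values are only approximate, because index vectors of different keys can overlap by chance; a larger state or fewer stored relationships reduces that crosstalk.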

Please use this url to cite or link to this publication: https://lup.lub.lu.se/record/a7ede0c1-5674-4115-b368-27564a26b128

author

Sandin, Fredrik; Emruli, Blerim and Sahlgren, Magnus

publishing date

2017

type

Contribution to journal

publication status

published

subject

Information Systems, Social aspects (including Human Aspects of ICT)

in

Knowledge and Information Systems

volume

52

pages

267 - 290

external identifiers

scopus:85001755138

ISSN

0219-3116

DOI

10.1007/s10115-016-1012-2

language

English

LU publication?

no

id

a7ede0c1-5674-4115-b368-27564a26b128

date added to LUP

2025-03-31 21:26:59

date last changed

2025-10-14 09:12:07

@article{a7ede0c1-5674-4115-b368-27564a26b128,
  abstract     = {{Random indexing (RI) is a lightweight dimension reduction method, which is used, for example, to approximate vector semantic relationships in online natural language processing systems. Here we generalise RI to multidimensional arrays and therefore enable approximation of higher-order statistical relationships in data. The generalised method is a sparse implementation of random projections, which is the theoretical basis also for ordinary RI and other randomisation approaches to dimensionality reduction and data representation. We present numerical experiments which demonstrate that a multidimensional generalisation of RI is feasible, including comparisons with ordinary RI and principal component analysis. The RI method is well suited for online processing of data streams because relationship weights can be updated incrementally in a fixed-size distributed representation, and inner products can be approximated on the fly at low computational cost. An open source implementation of generalised RI is provided.}},
  author       = {{Sandin, Fredrik and Emruli, Blerim and Sahlgren, Magnus}},
  issn         = {{0219-3116}},
  language     = {{eng}},
  pages        = {{267--290}},
  journal      = {{Knowledge and Information Systems}},
  title        = {{Random indexing of multidimensional data}},
  url          = {{http://dx.doi.org/10.1007/s10115-016-1012-2}},
  doi          = {{10.1007/s10115-016-1012-2}},
  volume       = {{52}},
  year         = {{2017}},
}

Lund University Publications

LUND UNIVERSITY LIBRARIES
