Sparse p-adic data coding for computationally efficient and effective big data analytics

Journal article


Murtagh, Fionn 2016. Sparse p-adic data coding for computationally efficient and effective big data analytics. P-Adic Numbers, Ultrametric Analysis, and Applications. https://doi.org/10.1134/S2070046616030055
Authors: Murtagh, Fionn
Abstract

We develop the theory and practical implementation of p-adic sparse coding of data. Rather than the standard, sparsifying criterion that uses the $L_0$ pseudo-norm, we use the p-adic norm. We require that the hierarchy or tree be node-ranked, as is standard practice in agglomerative and other hierarchical clustering, but not necessarily with decision trees. In order to structure the data, all computational processing operations are direct reading of the data, or are bounded by a constant number of direct readings of the data, implying linear computational time. Through p-adic sparse data coding, efficient storage results, and for bounded p-adic norm stored data, search and retrieval are constant time operations. Examples show the effectiveness of this new approach to content-driven encoding and displaying of data.


Keywords: Big data; P-adic numbers; Ultrametric topology; Hierarchical clustering; Binary rooted tree; Computational and storage complexity
Year: 2016
Journal: P-Adic Numbers, Ultrametric Analysis, and Applications
Publisher: Pleiades Publishing Ltd. (Springer)
ISSN: 2070-0466
2070-0474
Digital Object Identifier (DOI): https://doi.org/10.1134/S2070046616030055
Web address (URL): http://hdl.handle.net/10545/619218
http://creativecommons.org/licenses/by-nc-nd/4.0/
hdl:10545/619218
Publication dates: 14 Aug 2016
Publication process dates
Deposited: 01 Sep 2016, 11:02
Rights

Archived with thanks to P-Adic Numbers, Ultrametric Analysis, and Applications

Contributors: Department of Computing and Mathematics, Big Data Lab, University of Derby
File
File Access Level
Open
Permalink: https://repository.derby.ac.uk/item/93895/sparse-p-adic-data-coding-for-computationally-efficient-and-effective-big-data-analytics
