Finding longest increasing and common subsequences in streaming data
Abstract
We present algorithms and lower bounds for the Longest Increasing Subsequence (LIS) and Longest Common Subsequence (LCS) problems in the data-streaming model. To decide if the LIS of a given stream of elements drawn from an alphabet Σ has length at least k, we discuss a one-pass algorithm using O(klog |Σ|) space, with update time either O(log k) or O(log log |Σ|); for |Σ| = O(1), we can achieve O(log k) space and constant-time updates. We also prove a lower bound of Ω2(k) on the space requirement for this problem for general alphabets Σ, even when the input stream is a permutation of Σ. For finding the actual LIS, we give a [log(1 + 1/ε)-pass algorithm using O(k1+ε log |Σ|) space, for any ε > 0. For LCS, there is a trivial Θ(1)-approximate O(log n)-space streaming algorithm when |Σ| = O(1). For general alphabet Σ, the problem is much harder. We prove several lower bounds on the LCS problem, of which the strongest is the following: it is necessary to use Ω(n/ρ2} space to approximate the LCS of two n-element streams to within a factor of ρ, even if the streams are permutations of each other. © Springer-Verlag Berlin Heidelberg 2005.