Fast and Space-Efficient Location of Heavy or Dense Segments in Run-Length Encoded Sequences

Research output: Contribution to journal › Article › peer-review

Abstract

This paper considers several variations of an optimization problem with potential applications in areas such as biomolecular sequence analysis and image processing. Given a sequence of items, each with a weight and a length, the goal is to find a subsequence of consecutive items of optimal value, where value is either total weight or total weight divided by total length. There may also be a specified lower and/or upper bound on the acceptable length of subsequences. The paper shows that all variations of the problem are solvable in linear time and space, even with non-uniform item lengths and divisible items, implying that run-length encoded sequences can be handled in time and space linear in the number of runs. Furthermore, some problem variations can be solved in constant space. These time and space bounds also suffice for certain problem variations that call for reporting many “good” subsequences.
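To make the simplest variation concrete, the sketch below finds a heaviest segment (maximum total weight, no length bounds) in a run-length encoded sequence. It is not the paper's algorithm; it is the classic Kadane-style scan applied to runs, and the function name `heaviest_segment` and the `(value_per_item, run_length)` input format are assumptions made for illustration. It relies on the observation that, when at least one run has a positive value, some optimal segment consists of whole runs, so one pass over the runs with constant extra space suffices.

```python
from typing import List, Tuple


def heaviest_segment(runs: List[Tuple[float, int]]) -> Tuple[float, int, int]:
    """Return (best_weight, first_run, last_run) for a heaviest segment.

    Each run is (value_per_item, run_length); its weight is value * length.
    Assumption: segments are made of whole runs. This loses nothing when
    some run has a positive value. If every run is negative, a single item
    taken from inside a run could be better than any whole run, and this
    sketch does not handle that case.
    """
    if not runs:
        raise ValueError("runs must be non-empty")

    best_weight = float("-inf")
    best_start = best_end = 0
    current = 0.0          # weight of the best segment ending at the current run
    current_start = 0      # index of the run where that segment begins

    for i, (value, length) in enumerate(runs):
        weight = value * length
        if current <= 0:
            # A non-positive prefix never helps, so start a new segment here.
            current = weight
            current_start = i
        else:
            current += weight
        if current > best_weight:
            best_weight = current
            best_start, best_end = current_start, i

    return best_weight, best_start, best_end


# Runs with weights 3, -4, 6, -5: the third run alone is heaviest.
result = heaviest_segment([(1, 3), (-2, 2), (3, 2), (-1, 5)])
```

The scan touches each run once, so its cost depends on the number of runs rather than on the length of the decoded sequence. The density and length-bounded variations described in the abstract need more machinery than this and are not covered by the sketch.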

Original language: American English
Journal: Computer Science: Faculty Publications and Other Works
Volume: 2697 of Lecture Notes in Computer Science
State: Published - Jul 1 2003

Keywords

  • maximum consecutive subsequence sum
  • maximum-density segments
  • biomolecular sequence analysis
  • bioinformatics
  • image processing
  • data compression

Disciplines

  • Bioinformatics
  • Computer Sciences
  • Theory and Algorithms
