Results 1 - 2 of 2
iDiary: From GPS Signals to a Text-Searchable Diary
Abstract - Cited by 6 (0 self)
This paper describes a system that takes as input GPS data streams generated by users' phones and creates a searchable database of locations and activities. The system, called iDiary, turns large GPS signals collected from smartphones into textual descriptions of the trajectories. The system features a user interface similar to Google Search that allows users to type text queries about their activities (e.g., “Where did I buy books?”) and receive textual answers based on their GPS signals. iDiary uses novel algorithms for semantic compression (known as coresets) and trajectory clustering of massive GPS signals in parallel to compute the critical locations of a user. Using an external database, we then map these locations to textual descriptions and activities so that we can apply text mining techniques to the resulting data (e.g., LSA or transportation mode recognition). We provide experimental results for both the system and the algorithms and compare them to the existing commercial and academic state of the art. This is the first GPS system that enables text-searchable activities from GPS data.
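The final stage of the pipeline described above (text queries answered from locations that already carry textual descriptions) can be illustrated with a minimal sketch. This is not the iDiary implementation: the names `DiaryEntry` and `search_diary` and the two sample entries are hypothetical, and plain keyword overlap stands in for the text mining techniques (such as LSA) that the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class DiaryEntry:
    """Hypothetical record: one critical location with its text description."""
    time: str
    place: str
    description: str


# Small, assumed stopword list so that question words do not affect matching.
STOPWORDS = {"where", "when", "did", "i", "the", "a", "an", "at", "to", "my"}


def tokenize(text):
    """Lowercase the text, drop punctuation, and return the set of content words."""
    cleaned = "".join(c.lower() if c.isalnum() else " " for c in text)
    return {word for word in cleaned.split() if word not in STOPWORDS}


def search_diary(entries, query):
    """Return entries sharing at least one word with the query, best match first."""
    query_words = tokenize(query)
    scored = []
    for entry in entries:
        overlap = len(query_words & tokenize(entry.place + " " + entry.description))
        if overlap:
            scored.append((overlap, entry))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored]


# Made-up sample data, for illustration only.
entries = [
    DiaryEntry("Mon 09:10", "Main Street Cafe", "coffee shop, cafe, breakfast"),
    DiaryEntry("Mon 17:45", "Corner Bookstore", "bookstore, buy books, magazines"),
]
results = search_diary(entries, "Where did I buy books?")
```

With this sample data, only the bookstore entry shares words ("buy", "books") with the query, so it is the single result returned.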
The Single Pixel GPS: Learning Big Data Signals from Tiny Coresets
Abstract - Cited by 1 (1 self)
We present algorithms for simplifying and clustering patterns from sensors such as GPS, LiDAR, and other devices that can produce high-dimensional signals. The algorithms are suitable for handling very large (e.g., terabytes) streaming data and can be run in parallel on networks or clouds. Applications include compression, denoising, activity recognition, road matching, and map generation. We encode these problems as (k, m)-segment mean problems. Formally, we provide (1 + ε)-approximations to the k-segment and (k, m)-segment mean of a d-dimensional discrete-time signal. The k-segment mean is a k-piecewise linear function that minimizes the regression distance to the signal. The (k, m)-segment mean has an additional constraint that the projection of the k segments on R^d consists of only m ≤ k segments. Existing algorithms for these problems take O(kn^2) and n^O(mk) time respectively and O(kn^2) space, where n is the length of the signal. Our main tool is a new coreset for discrete-time signals. The coreset is a smart compression of the input signal that allows computation of a (1 + ε)-approximation to the k-segment or (k, m)-segment mean in O(n log n) time for arbitrary constants ε, k, and m. We use coresets to obtain a parallel algorithm that scans the signal in one pass, using space and update time per point that is polynomial in log n. We provide empirical evaluations of the quality of our coreset and experimental results that show how our coreset boosts both inefficient optimal algorithms and existing heuristics. We demonstrate our results for extracting signals from GPS traces. However, the results are more general and applicable to other types of sensors.
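To make the k-segment mean concrete, the sketch below computes it exactly for a one-dimensional signal with a textbook dynamic program that runs in O(kn^2) time. It is a baseline of the kind the abstract contrasts against, not the paper's coreset construction. The function name `k_segment_mean` is hypothetical, and the sketch assumes samples at times 0, 1, ..., n-1, squared vertical error as the regression distance, and segments that need not join continuously.

```python
import math


def k_segment_mean(y, k):
    """Exact k-segment mean of a 1-D signal y sampled at times 0..n-1.

    Returns (total squared error, segments), where each segment is a
    half-open index pair (start, end) fitted by its own least-squares line.
    """
    n = len(y)
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= len(y)")

    # Prefix sums of t, y, t*t, t*y, y*y give any segment's fit error in O(1).
    st, sy, stt, sty, syy = [0.0], [0.0], [0.0], [0.0], [0.0]
    for t, v in enumerate(y):
        st.append(st[-1] + t)
        sy.append(sy[-1] + v)
        stt.append(stt[-1] + t * t)
        sty.append(sty[-1] + t * v)
        syy.append(syy[-1] + v * v)

    def cost(i, j):
        """Squared error of the best-fit line through y[i:j]."""
        m = j - i
        a = st[j] - st[i]
        b = sy[j] - sy[i]
        vtt = (stt[j] - stt[i]) - a * a / m
        vty = (sty[j] - sty[i]) - a * b / m
        vyy = (syy[j] - syy[i]) - b * b / m
        if vtt <= 1e-12:  # a single point is always fitted exactly
            return max(vyy, 0.0)
        return max(vyy - vty * vty / vtt, 0.0)

    # best[j][i]: minimum error covering y[:i] with exactly j segments.
    best = [[math.inf] * (n + 1) for _ in range(k + 1)]
    arg = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                candidate = best[j - 1][s] + cost(s, i)
                if candidate < best[j][i]:
                    best[j][i] = candidate
                    arg[j][i] = s

    # Walk the stored split points backwards to recover the segments.
    segments, i = [], n
    for j in range(k, 0, -1):
        s = arg[j][i]
        segments.append((s, i))
        i = s
    segments.reverse()
    return best[k][n], segments


# Two exact line pieces: y = t on [0, 4) and y = 18 - 2t on [4, 8).
error, segments = k_segment_mean([0, 1, 2, 3, 10, 8, 6, 4], 2)
```

For this input the two pieces are recovered exactly, with zero error and a split at index 4. The triple loop is what makes the running time quadratic in n, which is the cost that a coreset-based approach aims to avoid on very long signals.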