Scalable clustering algorithms with balancing constraints
- Data Mining Knowledge Discovery
Cited by 9 (0 self)
Abstract. Clustering methods for data-mining problems must be extremely scalable. In addition, several data mining applications demand that the clusters obtained be balanced, i.e., of approximately the same size or importance. In this paper, we propose a general framework for scalable, balanced clustering. The data clustering process is broken down into three steps: sampling of a small representative subset of the points, clustering of the sampled data, and populating the initial clusters with the remaining data followed by refinements. First, we show that a simple uniform sampling from the original data is sufficient to get a representative subset with high probability. While the proposed framework allows a large class of algorithms to be used for clustering the sampled set, we focus on some popular parametric algorithms for ease of exposition. We then present algorithms to populate and refine the clusters. The algorithm for populating the clusters is based on a generalization of the stable marriage problem, whereas the refinement algorithm is a constrained iterative relocation scheme. The complexity of the overall method is O(kN log N) for obtaining k balanced clusters from N data points, which compares favorably with other existing techniques for balanced clustering. In addition to providing balancing guarantees, the clustering performance obtained using the proposed framework is comparable to and often better than the corresponding unconstrained solution. Experimental results on several datasets, including
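The three steps named in this abstract (uniform sampling, clustering the sample, populating the clusters under a balance constraint) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' method: plain k-means stands in for the clustering step, the populate step is a greedy capacity-constrained assignment rather than the stable-marriage generalization, and the refinement step is omitted. All function names and the ceil(N / k) capacity are hypothetical choices made for the example.

```python
import math
import random


def dist2(a, b):
    """Squared Euclidean distance between two points (tuples)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))


def kmeans(points, k, iters=20, rng=None):
    """Plain Lloyd's k-means on the sampled points; returns k centers."""
    rng = rng or random.Random(0)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: dist2(p, centers[c]))
            groups[j].append(p)
        # Keep the old center if a cluster went empty.
        centers = [mean(g) if g else centers[j] for j, g in enumerate(groups)]
    return centers


def balanced_populate(points, centers):
    """Greedy capacity-constrained assignment (an assumed simplification).

    Every (point, center) pair is visited in order of increasing distance;
    a point joins a center only while that center is below its capacity
    of ceil(N / k), so no cluster can exceed that size.
    """
    k = len(centers)
    cap = math.ceil(len(points) / k)
    pairs = sorted(
        (dist2(p, c), i, j)
        for i, p in enumerate(points)
        for j, c in enumerate(centers)
    )
    labels = [None] * len(points)
    sizes = [0] * k
    for _, i, j in pairs:
        if labels[i] is None and sizes[j] < cap:
            labels[i] = j
            sizes[j] += 1
    return labels


def balanced_clustering(points, k, sample_size, seed=0):
    """Run the three steps and return one cluster label per point."""
    rng = random.Random(seed)
    sample = rng.sample(points, sample_size)   # step 1: uniform sampling
    centers = kmeans(sample, k, rng=rng)       # step 2: cluster the sample
    return balanced_populate(points, centers)  # step 3: populate with balance
```

Because the total capacity k * ceil(N / k) is at least N, every point receives a label. Sorting all point-center pairs costs O(kN log(kN)) in this sketch, which is in the same spirit as, but not identical to, the bound quoted in the abstract.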
Attribute-Based Clustering for Information Dissemination in Wireless Sensor Networks
- in The 2nd Annual IEEE Communications Society Conf. on Sensor and Ad Hoc Communications and Networks (SECON'05), 2005
Cited by 8 (3 self)
Abstract – Data routing in large wireless sensor networks is challenged by requirements for scalability, robustness, and energy use, as well as the possibility that globally unique identifiers for each node are nonexistent. Moreover, if the sensor network serves multiple functions or users, its routing function must be energy efficient when responding to information requests (inquiries) that can arrive at high rates, require diverse data types, and target subsets of the sensor network. We propose virtual “containment” hierarchies based on relevant attributes as a more efficient mechanism for disseminating inquiries than flooding schemes. We present algorithms that cluster sensors according to these hierarchies and implement clusterhead failure recovery and load balancing among cluster members. Using analytical techniques, we show that our framework achieves significant bandwidth gains over a flooding-based scheme under the scenario considered.
PANEL: Position-based Aggregator Node Election in Wireless Sensor Networks
Cited by 6 (1 self)
In this paper, we introduce PANEL, a position-based aggregator node election protocol for wireless sensor networks. The novelty of PANEL with respect to other aggregator node election protocols is that it supports asynchronous sensor network applications where the sensor readings are fetched by the base stations after some delay. In particular, the motivation for the design of PANEL was to support reliable and persistent data storage applications, such as TinyPEDS [13]. PANEL ensures load balancing, and it supports intra- and inter-cluster routing, allowing sensor-to-aggregator, aggregator-to-aggregator, base-station-to-aggregator, and aggregator-to-base-station communications. We also compare PANEL with HEED [42] in the simulation environment provided by TOSSIM, and show that PANEL both creates more cohesive clusters than HEED and is more energy efficient than HEED.
Constraint-Driven Clustering
Cited by 4 (0 self)
Clustering methods can be either data-driven or need-driven. Data-driven methods intend to discover the true structure of the underlying data, while need-driven methods aim at organizing the true structure to meet certain application requirements. Thus, need-driven (e.g. constrained) clustering is able to find more useful and actionable clusters in applications such as energy-aware sensor networks, privacy preservation, and market segmentation. However, the existing methods of constrained clustering require users to provide the number of clusters, which is often unknown in advance but has a crucial impact on the clustering result. In this paper, we argue that a more natural way to generate actionable clusters is to let the application-specific constraints decide the number of clusters. For this purpose, we introduce a novel cluster model, Constraint-Driven Clustering (CDC), which finds an a priori unspecified number of compact clusters that satisfy all user-provided constraints. Two general types of constraints are considered, i.e. minimum significance constraints and minimum variance constraints, as well as combinations of these two types. We prove the NP-hardness of the CDC problem with different constraints. We propose a novel dynamic data structure, the CD-Tree, which organizes data points in leaf nodes such that each leaf node approximately satisfies the CDC constraints and minimizes the objective function. Based on CD-Trees, we develop an efficient algorithm to solve the new clustering problem. Our experimental evaluation on synthetic and real datasets demonstrates the quality of the generated clusters and the scalability of the algorithm.
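The two constraint types named in this abstract can be illustrated with a small feasibility check. The abstract does not define the constraints precisely, so the sketch below rests on assumptions: a minimum significance constraint is read as "a cluster holds at least min_size points", and a minimum variance constraint as "the mean squared distance to the cluster centroid is at least min_variance". The function and parameter names are hypothetical, and the CD-Tree itself is not modeled.

```python
def cluster_variance(points):
    """Mean squared Euclidean distance of the points to their centroid."""
    n = len(points)
    center = [sum(c) / n for c in zip(*points)]
    total = sum(
        sum((x - m) ** 2 for x, m in zip(p, center)) for p in points
    )
    return total / n


def satisfies_constraints(cluster, min_size=1, min_variance=0.0):
    """Check one cluster against both assumed constraint types."""
    # Assumed minimum significance constraint: enough points in the cluster.
    if len(cluster) < min_size:
        return False
    # Assumed minimum variance constraint: the cluster is spread out enough.
    return cluster_variance(cluster) >= min_variance


def all_clusters_feasible(clusters, min_size=1, min_variance=0.0):
    """A clustering is feasible only if every cluster passes the check."""
    return all(
        satisfies_constraints(c, min_size, min_variance) for c in clusters
    )
```

Under this reading, the number of clusters is not an input: any partition whose clusters all pass the check is feasible, and the constraints themselves limit how many clusters a dataset can support.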
An Adaptive Data Dissemination Strategy for Wireless Sensor Networks
Cited by 1 (0 self)
Future large-scale sensor networks may comprise thousands of wirelessly connected sensor nodes that could provide an unimaginable opportunity to interact with physical phenomena in real time. However, the nodes are typically highly resource-constrained. Since the communication task is a significant power consumer, there are various attempts to introduce energy-awareness at different levels within the communication stack. Clustering is one such attempt to control energy dissipation for sensor data dissemination in a multihop fashion. The Time-Controlled Clustering Algorithm (TCCA) was proposed to realize a network-wide energy reduction. A realistic energy dissipation model is derived probabilistically to quantify the sensor network’s energy consumption using the proposed clustering algorithm. A discrete-event simulator is developed to verify the mathematical model and to further investigate TCCA in other scenarios. The simulator is also extended to include the rest of the communication stack to allow a comprehensive evaluation of the proposed algorithm.
Genetic Algorithm Based Node Placement Methodology For Wireless Sensor Networks
Abstract. A Genetic Algorithm-based multi-objective methodology was implemented for a self-organizing wireless sensor network. Design parameters such as network density, connectivity and energy consumption are taken into account in developing the fitness function. The genetic algorithm optimizes the operational modes of the sensor nodes along with clustering schemes and transmission signal strengths. The algorithm has been implemented in MATLAB using its Genetic Algorithm toolbox along with custom code. The optimal designs produced by the algorithm conform to all the design parameters.
ADAPTIVE ATTRIBUTE-BASED ROUTING IN CLUSTERED WIRELESS SENSOR NETWORKS
I would first like to thank my advisor, Prof. Thomas D. C. Little, without whose support and acceptance I would not have been able to even embark on this long journey. His support and patience while I explored different paths along the road helped me mature into someone who can determine his own research direction and goals. I am indebted to my former colleague, Prithwish Basu, who taught me the nitty-gritty details of how to be a PhD student. His going forth ahead of me helped me see how the road can be trodden. My colleagues Salma Abu Ayyash and Ashish Aggarwal contributed discussion, ideas and encouragement as fellow sojourners. I am deeply indebted to my parents, without whose daily love, support and encouragement I would not have been able to take any steps in life, much less in this doctorate program. I am deeply thankful to all my brothers and sisters in Christ in my church. They are the family I have found in this foreign land, who made me feel at home, and have been with me ever since I arrived here as a young, inexperienced and immature graduate student. Their love helped me grow as a person, and for that I am deeply indebted. I want to thank my fiance, for joining me in this journey,

