Applying Social Network Analysis to the Information in CVS Repositories
Cited by 31 (0 self)
The huge quantities of data available in the CVS repositories of large, long-lived libre (free, open source) software projects, and the many interrelationships among those data, offer opportunities for extracting large amounts of valuable information about their structure, evolution and internal processes. Unfortunately, the sheer volume of that information renders it almost unusable without applying methodologies which highlight the relevant information for a given aspect of the project. In this paper, we propose the use of a well-known set of methodologies (social network analysis) for characterizing libre software projects, their evolution over time and their internal structure. In addition, we show how we have applied such methodologies to real cases, and draw some preliminary conclusions from that experience.
Discussion of a large-scale open source data collection methodology
- Proceedings of the Hawaii International Conference on System Sciences (HICSS-38), 2005
Cited by 16 (2 self)
This paper discusses in detail a possible methodology for collecting repository data on a large number of open source software projects from a single project hosting and community site. The process of data retrieval is described along with the possible metrics that can be computed and which can be used for further analyses. Example research areas to be addressed with the available data and first results are given. Then, both advantages and disadvantages of the proposed methodology are discussed together with implications for future approaches.
How to Have A Successful Free Software Project
- In Proceedings of the 11th Asia-Pacific Software Engineering Conference, 2004
Cited by 15 (3 self)
Some free software projects have been extremely successful. This rise to prominence can be attributed to the high quality and suitability of the software. This quality and suitability is achieved through an elaborate peer-review process performed by a large community of users, who act as co-developers to identify and correct software defects and add features. Although this process is crucial to the success of free software projects, there is more to free software development than the creation of a ‘bazaar’. In this paper we draw on existing free software projects to define a lifecycle model for free software. This paper then explores each phase of the lifecycle model and agrees that, while the bazaar phase attracts the most attention, it is the initial modular design that accommodates diverse interventions. Moreover, it is the period of transition from the initial group to the larger community-based development that is crucial in determining whether a free software project will succeed or fail.
Predicting Defects using Network Analysis on Dependency Graphs
Cited by 14 (3 self)
In software development, resources for quality assurance are limited by time and by cost. In order to allocate resources effectively, managers need to rely on their experience backed by code complexity metrics. But often dependencies exist between various pieces of code over which managers may have little knowledge. These dependencies can be construed as a low-level graph of the entire system. In this paper, we propose to use network analysis on these dependency graphs. This allows managers to identify central program units that are more likely to face defects. In our evaluation on Windows Server 2003, we found that the recall for models built from network measures is 10 percentage points higher than for models built from complexity metrics. In addition, network measures could identify 60% of the binaries that the Windows developers considered critical, twice as many as identified by complexity metrics.
Categories and Subject Descriptors: D.2.8 [Software Engineering]: Metrics - Performance measures, Process metrics, Product metrics. D.2.9 [Software Engineering]: Management - Software quality assurance (SQA)
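As a rough illustration of what "network analysis on dependency graphs" can mean, the sketch below computes in-degree and out-degree for each unit in a small dependency graph. This is only an assumption for illustration: the abstract does not say which network measures the authors used, and the binary names and edges here are hypothetical.

```python
from collections import defaultdict


def degree_centrality(dependencies):
    """Return in-degree and out-degree per unit.

    `dependencies` is an iterable of (source, target) pairs meaning
    "source depends on target". Duplicate pairs are counted once.
    """
    in_deg = defaultdict(int)
    out_deg = defaultdict(int)
    nodes = set()
    for src, dst in set(dependencies):
        nodes.update((src, dst))
        out_deg[src] += 1
        in_deg[dst] += 1
    return {n: {"in": in_deg[n], "out": out_deg[n]} for n in nodes}


# Hypothetical binaries and dependencies, not taken from the paper.
deps = [
    ("app.exe", "core.dll"),
    ("ui.dll", "core.dll"),
    ("core.dll", "kernel.dll"),
    ("app.exe", "ui.dll"),
]
centrality = degree_centrality(deps)

# The unit that the most other units depend on.
most_depended_on = max(centrality, key=lambda n: centrality[n]["in"])
```

In this toy graph `core.dll` has the highest in-degree, so a measure of this kind would flag it as a central unit worth extra quality-assurance attention.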
GlueTheos: Automating the Retrieval and Analysis of Data from Publicly Available Software Repositories
- In Proceedings of the International Workshop on Mining Software Repositories, 2004
Cited by 13 (3 self)
For efficient, large-scale data mining of publicly available information about libre (free, open source) software projects, automating the retrieval and analysis processes is a must. A system implementing such automation must take into account the many kinds of repositories with interesting information (each with its own structure and access methods), and the many kinds of analysis which can be applied to the retrieved data. In addition, such a system should be capable of interfacing with and reusing as much existing software for both retrieving and analyzing data as possible.
Understanding knowledge sharing activities in free/open source software projects
- Journal of Systems and Software, 2007
Cited by 4 (0 self)
Free/Open Source Software (F/OSS) projects are people-oriented and knowledge-intensive software development environments. Many researchers have focused on mailing lists to study the coding activities of software developers. How expert software developers interact with each other and with non-developers in the use of community products has received little attention. This paper discusses the altruistic sharing of knowledge between knowledge providers and knowledge seekers in the Developer and User mailing lists of the Debian project. We analyze the posting and replying activities of the participants by counting the number of email messages they posted to the lists and the number of replies they made to questions others posted. We found that participants interact and share their knowledge a lot, that their posting activity is fairly highly correlated with their replying activity, that the characteristics of posting and replying activities differ between kinds of lists, and that the knowledge sharing activity of self-organizing Free/Open Source communities could best be explained in terms of what we called a “Fractal Cubic Distribution” rather than the power-law distribution mostly reported in the literature. The paper also proposes what could be researched in knowledge sharing activities in F/OSS project mailing lists and for what purpose. The research findings add to our understanding of knowledge sharing activities in F/OSS projects.
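The counting step described in this abstract can be sketched as follows: tally posts and replies per participant, then correlate the two tallies. The use of the Pearson coefficient is an assumption, since the abstract does not name the correlation measure, and the participants and message records below are hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical (author, kind) records; kind is "post" or "reply".
messages = (
    [("alice", "post")] * 1 + [("alice", "reply")] * 2
    + [("bob", "post")] * 2 + [("bob", "reply")] * 4
    + [("carol", "post")] * 3 + [("carol", "reply")] * 6
)

posts = Counter(author for author, kind in messages if kind == "post")
replies = Counter(author for author, kind in messages if kind == "reply")

# Align both counts on the same participant order before correlating.
participants = sorted(set(posts) | set(replies))
r = pearson([posts[p] for p in participants],
            [replies[p] for p in participants])
```

The toy data is deliberately constructed so that replies are exactly twice the posts for every participant, which makes the coefficient equal to 1; real mailing list data would give a value strictly between -1 and 1.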
Using repository of repositories (RoRs) to study the growth of F/OSS projects: A meta-analysis research approach
- In Third International Conference on Open Source Systems, 2007
Cited by 4 (1 self)
Free/Open Source Software (F/OSS) repositories contain valuable data, and their usefulness in studying software development and community activities continues to attract a lot of research attention. A trend in F/OSS studies is the use of metadata stored in a repository of repositories or RoRs. This paper uses data obtained from one such RoR, FLOSSmole, to study the types of projects being developed by the F/OSS community. We downloaded projects-by-topic data in five areas (Database, Internet, Software Development, Communications, and Games/Entertainment) from FLOSSmole’s raw and summary data of the SourceForge repository. Time series analysis shows that the numbers of projects in the five topics are growing linearly. Further analysis supports our hypothesis that F/OSS development is moving “up the stack” from developer tools and infrastructure support to end-user applications such as Databases. The findings have implications for the interpretation of the F/OSS landscape, the utilization and adoption of open source databases, and problems researchers might face in obtaining and using data from RoRs.
Community structure of modules in the Apache project
- In Proceedings of the 4th Workshop on Open Source Software Engineering, 2004
The relationships among modules in a software project of a certain size can give us much information about its internal organization and a way to control and monitor development activities and evolution of large libre software projects. In this paper, we show how information available in CVS repositories can be used to study the structure of the modules in a project when they are related by the people working on them, and how techniques taken from the field of social networks can be used to highlight the characteristics of that structure. As a case example, we also show some results of applying this methodology to the Apache project at several points in time. Among other facts, it is shown how the project evolves and is self-structuring, with developer communities of modules corresponding to semantically related families of modules.
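One way to picture modules "related by the people working on them" is a weighted module network in which two modules are linked when at least one developer has committed to both. The sketch below builds such a network from (developer, module) commit records. The edge weighting by number of shared developers is an assumption, not necessarily the authors' definition, and the developer and module names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def module_network(commits):
    """Build weighted module-module edges from (developer, module) pairs.

    Two modules are connected when they share at least one committer;
    the edge weight is the number of shared committers.
    """
    devs_by_module = defaultdict(set)
    for developer, module in commits:
        devs_by_module[module].add(developer)

    edges = {}
    for a, b in combinations(sorted(devs_by_module), 2):
        shared = devs_by_module[a] & devs_by_module[b]
        if shared:
            edges[(a, b)] = len(shared)
    return edges


# Hypothetical commit records, not taken from any real repository.
commits = [
    ("dev1", "module_a"), ("dev1", "module_b"),
    ("dev2", "module_a"), ("dev2", "module_b"),
    ("dev2", "module_c"), ("dev3", "module_c"),
    ("dev4", "module_d"),
]
edges = module_network(commits)
```

Here `module_a` and `module_b` end up with the strongest link (two shared developers), while `module_d` stays isolated. Community detection or other social network techniques could then be applied to the resulting edge list.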

