A comparison of sources of links for academic Web Impact Factor calculations
Journal of Documentation, 2002
Cited by 18 (10 self)
There has been much recent interest in extracting information from collections of web links. One tool that has been used is Ingwersen’s Web Impact Factor (WIF). It has been demonstrated that several versions of this metric can produce results that correlate with research ratings of British universities, showing that, despite being a measure of a purely Internet phenomenon, the results are susceptible to a wider interpretation. This paper addresses the question of which is the best possible domain to count backlinks from, if research is the focus of interest. WIFs for British universities calculated from several different source domains are compared, primarily the .edu, .ac.uk and .uk domains, and the entire web. The results show that all four areas produce WIFs that correlate strongly with research ratings, but that none produce incontestably superior figures. It was also found that the WIF was less able to differentiate in more homogeneous subsets of universities, although positive results are still possible.
Google Scholar citations and Google Web/URL citations: A multi-discipline exploratory analysis
Journal of the American Society for Information Science and Technology, 2007
Cited by 15 (4 self)
In this paper we introduce a new data gathering method, “Web/URL Citation”, and use it and Google Scholar as a basis to compare traditional and Web-based citation patterns across multiple disciplines. For this, we built a sample of 1,650 articles from 108 Open Access (OA) journals published in 2001 in four science and four social science disciplines. We recorded the number of citations to the sample articles using several methods based upon the ISI Web of Science, Google Scholar and the Google search engine (Web/URL citations). For each discipline, we found significant correlations between ISI citations and both Google Scholar and Google Web/URL citations, with similar results when using total or average citations, and when comparing within and across (most) journals. We also investigated disciplinary differences. Google Scholar citations were more numerous than ISI citations in our four social science disciplines as well as in computer science, suggesting that Google Scholar is a more comprehensive tool for citation tracking in the social sciences and perhaps also in fast-moving fields where conference papers are highly valued and published online. The results for Web/URL citations suggested that counting a maximum of one hit per site produces a better measure for assessing the impact of OA journals or articles, because replicated web citations are very common within individual sites. The results can be considered as additional evidence that there is some commonality between traditional and Web-extracted citations.
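The "maximum of one hit per site" counting rule mentioned in this abstract can be illustrated with a short sketch. This is not the authors' procedure: the function name `count_citations_per_site` and the choice of the URL host name as the definition of a "site" are assumptions made only for illustration, and the URLs are made up.

```python
from urllib.parse import urlparse


def count_citations_per_site(citing_urls):
    """Return (raw_hits, one_per_site_hits) for a list of citing page URLs.

    Assumption: a "site" is identified by the lower-cased host name of the URL.
    """
    sites = set()
    for url in citing_urls:
        host = urlparse(url).netloc.lower()
        if host:  # skip strings that do not parse as absolute URLs
            sites.add(host)
    return len(citing_urls), len(sites)


# Hypothetical citing pages: three hits on one site, one on another.
hits = [
    "http://example.org/a.html",
    "http://example.org/b.html",
    "http://EXAMPLE.org/mirror/a.html",
    "http://example.net/refs.html",
]
raw, per_site = count_citations_per_site(hits)
```

Here the raw count includes every replicated page, while the per-site count collapses all pages on the same host into a single citation.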
Three target document range metrics for university Web sites
Journal of the American Society for Information Science and Technology, 2003
Cited by 14 (7 self)
Three new metrics are introduced that measure the range of use of a university Web site by its peers through different heuristics for counting links targeted at its pages. All three give results that correlate significantly with the research productivity of the target institution. The directory range model, which is based upon summing the number of distinct directories targeted by each other university, produces the most promising results of any link metric yet. Based upon an analysis of changes between models, it is suggested that range models measure essentially the same quantity as their predecessors but are less susceptible to spurious causes of multiple links and are therefore more robust.
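One possible reading of the directory range model described above is sketched below: for each other university, count the distinct directories it targets on the site, then sum those counts. The helper names `directory_of` and `directory_range`, the input format of (source university, target URL) pairs, and the example URLs are all hypothetical; the paper's exact heuristics may differ.

```python
from collections import defaultdict
from urllib.parse import urlparse
import posixpath


def directory_of(url):
    """Return host plus directory path of a URL, dropping any file name."""
    parts = urlparse(url)
    path = parts.path
    directory = path if path.endswith("/") else posixpath.dirname(path)
    if not directory.endswith("/"):
        directory += "/"
    return parts.netloc.lower() + directory


def directory_range(links):
    """Sum, over source universities, the distinct target directories each links to.

    links: iterable of (source_university, target_url) pairs, where every
    target_url is assumed to belong to the one site being measured.
    """
    dirs_by_source = defaultdict(set)
    for source, target_url in links:
        dirs_by_source[source].add(directory_of(target_url))
    return sum(len(dirs) for dirs in dirs_by_source.values())


# Hypothetical inlinks to one target site from two other universities.
links = [
    ("uni_a", "http://t.example/physics/p1.html"),
    ("uni_a", "http://t.example/physics/p2.html"),
    ("uni_a", "http://t.example/maths/index.html"),
    ("uni_b", "http://t.example/physics/p1.html"),
]
score = directory_range(links)
```

Four raw links reduce to a score of 3, because the two links from `uni_a` into the same directory count once. This is the sense in which a range count is less sensitive to repeated links than a raw link count.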
Methodologies for Crawler Based Web Surveys
2002
Cited by 13 (6 self)
There have been many attempts to study the content of the web, either through human or automatic agents. Five different previously used web survey methodologies are described and analysed, each justifiable in its own right, but a simple experiment is presented that demonstrates concrete differences between them. The concept of crawling the web also bears further inspection, including the scope of the pages to crawl, the method used to access and index each page, and the algorithm for the identification of duplicate pages. The issues involved here will be well-known to many computer scientists but, with the increasing use of crawlers and search engines in other disciplines, they now require a public discussion in the wider research community. This paper concludes that any scientific attempt to crawl the web must make available the parameters under which it is operating so that researchers can, in principle, replicate experiments or be aware of and take into account differences between methodologies. A new hybrid random page selection methodology is also introduced.
The Connections between the Research of a University and Counts of Links to Its Web Pages: An Investigation Based Upon a Classification of the Relationships of Pages to the Research of the Host University
Journal of the American Society for Information Science and Technology, 2002
Cited by 13 (6 self)
This paper uses a page categorization in order to show that restricting the metrics to subsets more closely related to the research of the host university can produce even stronger associations. A partial overlap was also found between the effects of applying advanced document models and separating page types, but the best results were achieved through a combination of the two.
Interpreting social science link analysis research: A theoretical framework
Journal of the American Society for Information Science and Technology, 2006
Cited by 7 (1 self)
Link analysis in various forms is now an established technique in many different subjects, reflecting the perceived importance of links and that of the web. A critical but very difficult issue is how to interpret the results of social science link analyses. It is argued that the dynamic nature of the web, its lack of quality control and the online proliferation of copying and imitation mean that methodologies operating within a highly positivist, quantitative framework are ineffective. Conversely, the sheer variety of the web makes qualitative methodologies and pure reason very problematic to apply to large-scale studies. Methodology triangulation is consequently advocated, in combination with a warning that the web is incapable of giving definitive answers to large-scale link analysis research questions concerning social factors underlying link creation. Finally, it is claimed that whilst theoretical frameworks with which to guide research are appropriate, a Theory of Link Analysis is not possible.
A layered approach for investigating the topological structure of communities in the Web
Journal of Documentation, 2003
Cited by 5 (4 self)
A layered approach for identifying communities in the Web is presented and explored by applying the Flake Exact Community Identification Algorithm to the UK academic Web. Although community or topic identification is a common task in information retrieval, a new perspective is developed by: (a) the application of Alternative Document Models, shifting the focus from individual pages to aggregated collections based upon Web directories, domains and entire sites; (b) the removal of internal site links; and (c) the adaptation of a new fast algorithm to allow fully automated community identification using all possible single starting points. The overall topology of the graphs in the three least aggregated layers was first investigated and found to include a large number of isolated points but, surprisingly, with most of the remainder being in one huge connected component, exact proportions varying by layer. The community identification process then found that the number of communities far exceeded the number of topological components, indicating that community identification is a potentially useful technique, even with random starting points. Both the number and size of communities identified were dependent on the parameter of the algorithm, with very different results being obtained in each case. In conclusion, the UK academic Web is embedded with layers of non-trivial communities and, if it is not unique in this, then there is the promise of (a) improved results for information retrieval algorithms that can exploit this additional structure, and (b) the application of the technique directly to partially automate Web metrics tasks such as that of finding all pages related to a given subject hosted by a single country’s universities.
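The preprocessing and topology steps described in this abstract (aggregating pages into a coarser layer, removing internal links, then examining connected components) might look roughly like the sketch below. It does not implement the Flake community identification algorithm; it only covers the component analysis. Treating the URL host name as the aggregation unit, treating links as undirected, and the function names are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def site_layer_edges(page_links):
    """Aggregate page-to-page links to host level and drop internal links."""
    edges = set()
    for src, dst in page_links:
        s = urlparse(src).netloc.lower()
        d = urlparse(dst).netloc.lower()
        if s and d and s != d:  # keep only links between different hosts
            edges.add((s, d))
    return edges


def connected_components(nodes, edges):
    """Group nodes into components, treating links as undirected (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


# Hypothetical page-level links; c.example has only internal links.
page_links = [
    ("http://a.example/x.html", "http://a.example/y.html"),
    ("http://a.example/x.html", "http://b.example/"),
    ("http://c.example/p.html", "http://c.example/q.html"),
]
hosts = {urlparse(u).netloc.lower() for pair in page_links for u in pair}
edges = site_layer_edges(page_links)
components = connected_components(hosts, edges)
```

In this toy example `a.example` and `b.example` form one component, while `c.example` becomes an isolated point once its internal links are removed.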
Methodologies for Crawler Based Web Surveys
Internet Research: Electronic Networking and Applications, 2002
Cited by 4 (4 self)
There have been many attempts to study the content of the Web, either through human or automatic agents. Describes five different previously used Web survey methodologies, each justifiable in its own right, but presents a simple experiment that demonstrates concrete differences between them. The concept of crawling the Web also bears further inspection, including the scope of the pages to crawl, the method used to access and index each page, and the algorithm for the identification of duplicate pages. The issues involved here will be well-known to many computer scientists but, with the increasing use of crawlers and search engines in other disciplines, they now require a public discussion in the wider research community. Concludes that any scientific attempt to crawl the Web must make available the parameters under which it is operating so that researchers can, in principle, replicate experiments or be aware of and take into account differences between methodologies. Also introduces a new hybrid random page selection methodology.

