Mapping world-class universities on the web

Jose Luis Ortega *, Isidro F. Aguillo
Cybermetrics Lab, IEDCYT-CSIC, Joaquín Costa, 22, 28002 Madrid, Spain

Article info

Article history: Received 13 June 2008; received in revised form 15 October 2008; accepted 16 October 2008; available online 29 November 2008

Keywords: Information visualization; Social network analysis; Webometrics; University web

Abstract

A visual display of the most important universities in the world is the aim of this paper. It shows the topological characteristics and describes the web relationships among universities of different countries and continents. The first 1000 higher education institutions from the Ranking Web of World Universities were selected and their link relationships were obtained from Yahoo! Search. Network graphs and geographical maps were built from the search engine data. Social network analysis techniques were used to analyse and describe the structural properties of the whole of the network and its nodes. The results show that the world-class university network is constituted from national sub-networks that merge in a central core where the principal universities of each country pull their networks toward international link relationships. The United States dominates the world network, and within Europe the British and the German sub-networks stand out.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

The World Wide Web has become a key medium for promoting and developing the academic, scientific and educational competences of a university. E-learning programs and open access initiatives allow knowledge of these institutions to spread beyond physical boundaries. The Web can hence be used as a way to attract students, scholars and funding from other places, spreading the prestige of these educational institutions all over the world.
This has provoked competition between universities to achieve an advantageous visibility on the Web and to improve their position in search engine results.

Web performance has been analysed from different points of view. Web data have been used as an indicator of the educational and scientific activity developed on the Web, relating web indicators with academic outputs (Smith, 2008; Thelwall, 2002a; Thelwall & Harries, 2003, 2004) or bibliometric indicators (Aguillo, Granadino, Ortega, & Prieto, 2006). Visualization of Information (Chen, 2003) has also been a suitable tool for mapping university linkages and showing visual relationships according to several variables. The first attempts used multivariate analysis to plot and group universities (Polanco, Boudourides, Besagni, & Roche, 2001; Vaughan, 2006). Now, Network Analysis offers additional structural and visual possibilities. Heimeriks and Van Den Besselaar (2006) used these analysis techniques to detect four geographical zones in the European Union (EU) academic web space: Scandinavia, UK, Germany and South Europe. Similar results were obtained by Ortega, Aguillo, Cothey, and Scharnhorst (2008), finding that European universities are grouped in local or national sub-networks which are connected with other sub-networks for linguistic or geographical reasons (Thelwall, 2002b; Thelwall, Tang, & Price, 2003). Lately, Thelwall and Zuccala (2008) have studied the link relationship between universities and national web spaces in Europe, describing the European web relationships at the country level.

All these studies were focused on countries such as Spain (Ortega & Aguillo, 2007; Thelwall & Aguillo, 2003) or Canada (Vaughan, 2006; Vaughan & Thelwall, 2005), or on regions such as the EU (Heimeriks et al., 2006; Ortega et al., 2008; Thelwall & Zuccala, 2008) or Scandinavia (Ortega & Aguillo, 2008a). However, mapping universities at a global level and with a large and consistent population has not been attempted.

Information Processing and Management 45 (2009) 272–279
Journal homepage: www.elsevier.com/locate/infoproman
doi:10.1016/j.ipm.2008.10.001
0306-4573/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
* Corresponding author. Tel.: +34 916022890; fax: +34 916022971.
E-mail addresses: jortega@orgc.csic.es (J.L. Ortega), isidro@cindoc.csic.es (I.F. Aguillo).

2. Objectives

The purpose of this paper is to present a visual display of the 1000 most important universities in the World according to the Ranking Web of World Universities (www.webometrics.info). This map intends to show the topological characteristics of this network and to describe the relationships among universities of different countries and continents. We also present, through network analysis techniques, the most important universities in the network structure, the gateway universities that connect different web spaces or sub-networks, and the network core.

3. Methods

3.1. Data extraction

We have selected the first 1000 higher education institutions from the Ranking Web of World Universities. This ranking orders the universities according to four main web characteristics from their institutional web domain. The volume of contents is measured by the number of pages freely accessible, and their visibility by the number of incoming links. The number of rich files is used as an indicator because rich files are a format to spread scientific and technical data and results. Finally, the total number of documents indexed in Google Scholar is an indicator of the scientific publications on the Web.
Each web domain is ranked by the linear aggregation of these indicators, building the webometrics ranking (WR):

\[ \mathrm{WR} = 4\,(\text{visibility rank}) + 2\,(\text{site rank}) + 1\,(\text{rich file rank}) + 1\,(\text{Google Scholar rank}) \]

More information about the methodology of Ranking Web of World Universities is available in About the Ranking (www.webometrics.info/about_rank.html) and in Methodology (www.webometrics.info/methodology.html).

Together, these indicators make it possible to describe the performance of these academic institutions on the Web, complementing other educational and scientific rankings. The main search engines (Google, Yahoo! Search, Live Search and Exalead) are used to implement this ranking (Aguillo, Ortega, & Fernandez, 2008).

One thousand institutions were selected because a digital divide was perceived between North American universities and the rest of the World. North American universities make up 59.5% of the top 200 list and 40.36% of the top 500 list (Aguillo et al., 2008). Hence, we decided to take a wide sample that represents all the continents.

An asymmetric link matrix between this set of universities was built. The rows show the inlinks received by each web domain and the columns the number of outlinks that a web domain makes. This asymmetric matrix is necessary in order to build directed networks, where the meaning of inlinks and outlinks has to be specified: links from university domain A to B are not the same as links from university domain B to A.

Data extraction was made in February 2008 from Yahoo! Search, which allows several search operators in a single search string and has rather wide web coverage.
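
The linear aggregation behind WR can be illustrated with a minimal Python sketch. The weights (4, 2, 1, 1) come from the formula in the text; the domain names, the rank positions and the tie-breaking rule are invented for illustration and are not taken from the ranking itself.

```python
# Weights from the WR formula in the text; everything else is hypothetical.
WEIGHTS = {"visibility": 4, "site": 2, "rich_files": 1, "scholar": 1}


def webometrics_score(ranks):
    """Return the weighted sum of the four rank positions (lower is better)."""
    return sum(WEIGHTS[name] * ranks[name] for name in WEIGHTS)


def rank_domains(rank_table):
    """Order domains by ascending WR score; ties broken alphabetically (an assumption)."""
    return sorted(rank_table, key=lambda d: (webometrics_score(rank_table[d]), d))


# Invented rank positions for three placeholder domains.
example = {
    "univ-a.example": {"visibility": 1, "site": 3, "rich_files": 2, "scholar": 2},
    "univ-b.example": {"visibility": 2, "site": 1, "rich_files": 1, "scholar": 1},
    "univ-c.example": {"visibility": 3, "site": 2, "rich_files": 3, "scholar": 3},
}

if __name__ == "__main__":
    for domain in rank_domains(example):
        print(domain, webometrics_score(example[domain]))
```

Because visibility carries half of the total weight, a domain that leads on inlinks can still be overtaken by one that ranks better on the other three indicators, as happens with the placeholder domains above.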
The following queries were used to obtain links from the university domain (A) to the university domain (B) and vice versa:

site: {university domain (A)} linkdomain:{university domain (B)}

and to obtain the total number of pages indexed in the university domain (A):

site: {university domain (A)}

An SQL routine was used to submit the 1,001,000 queries needed to build the link matrix.

3.2. Geographical map

We built a geographical map in order to show the distribution of pages and link flows at the level of countries. A base map was downloaded from the Blue Marble Geographics web site (www.bluemarblegeo.com) in ESRI shapefile format. We used the geographical information system (GIS) software MapViewer 6 to build the final map. This map has two layers: a hatch map which represents the number of web pages by country and a flow map which shows the links between countries. Quantitative data such as number of web pages and in- and outlinks were added through a spreadsheet and assigned to each map zone (country). The classification method used in both layers was Jenks' natural breaks (Jenks, 1963). This method determines the best arrangement of values into classes by iteratively comparing sums of the squared difference between observed values within each class and class means. This method improves the visualization and the interpretation of the results, because it creates more significant differences between classes.

3.3. Network graph

A network graph was built with the links between the 1000 university web domains. Several variables have been used in order to add information about the network configuration. Node size shows the volume of web pages that each university publishes on the Web, colours represent the nationality of each higher education organization and arc width shows the frequency of in- and outlinks between two university domains.
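
The query scheme of Section 3.1 and the step from the asymmetric matrix to weighted arcs can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' SQL routine: the function names and domains are hypothetical, and reading the 1,001,000 figure as 1000 × 1000 link queries plus 1000 size queries is an inference from the numbers, not something the text states.

```python
def build_queries(domains):
    """Return (link_queries, size_queries) using the query syntax quoted in Section 3.1."""
    link_queries = {
        (a, b): f"site:{a} linkdomain:{b}" for a in domains for b in domains
    }
    size_queries = {a: f"site:{a}" for a in domains}
    return link_queries, size_queries


def query_count(n):
    """Queries needed for n domains, assuming n*n link queries plus n size queries."""
    return n * n + n


def matrix_to_arcs(domains, matrix):
    """Turn the asymmetric matrix into weighted directed arcs (source, target, links).

    Following the text, matrix[i][j] is read as the links that domain i
    receives from domain j, so each non-zero cell becomes an arc j -> i.
    Self-links on the diagonal are skipped (an assumption).
    """
    arcs = []
    for i, target in enumerate(domains):
        for j, source in enumerate(domains):
            if i != j and matrix[i][j] > 0:
                arcs.append((source, target, matrix[i][j]))
    return arcs


# Hypothetical three-domain example with invented link counts.
toy_domains = ["a.example", "b.example", "c.example"]
toy_matrix = [
    [0, 5, 0],  # a.example receives 5 links from b.example
    [2, 0, 7],  # b.example receives 2 from a.example and 7 from c.example
    [0, 1, 0],  # c.example receives 1 link from b.example
]

if __name__ == "__main__":
    print(query_count(1000))
    for arc in matrix_to_arcs(toy_domains, toy_matrix):
        print(arc)
```

Keeping the matrix asymmetric preserves direction: the 5 links from b.example to a.example and the 2 links in the opposite direction remain separate arcs, which is what arc width in the graph encodes.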

The software used to visualise the network was Pajek 1.02. The link matrix was converted to a network file (.net), the nominal variable (countries) was converted into a cluster file (.clu) and the discrete one (number of web pages) into a vector file (.vec). These three files were merged to present the final graph output. We selected a cut-off of a minimum of 50 links to improve the network visualization. We also used the Fruchterman–Reingold algorithm to lay out the network because it is fast for large networks (de Nooy, Mrvar, & Batagelj, 2005).

Several social network indicators were used to describe the network topology and the main characteristics of the nodes:

- K-Core: a sub-network in which each node has at least degree k. K-Cores detect groups with a strong link density. In scale-free networks, such as the Web, the core with the highest degree is the central core of the network, detecting the set of nodes the network rests on (Seidman, 1983).
- Degree: the number of lines connecting a node. This can be normalized (nDegree) by the total number of nodes in the network. In a directed network such as the Web we can count only the incoming links (InDegree) or the outgoing links (OutDegree). In Webometrics, InDegree has been characterised as the visibility of a web domain (Cothey, 2005; Kretschmer & Kretschmer, 2006), while OutDegree has been characterised as the capacity to generate web traffic.
- Betweenness: the capacity of one node to help connect those nodes that are not directly connected to each other. Its normalization is the percentage over the total number of nodes in the network. From a webometric point of view, this measure allows us to detect hubs or gateways that connect different web sub-networks (Faba-Pérez, Zapico-Alonso, Guerrero-Bote & Moya-Anegón, 2005; Ingwersen, 1998).

4. Results

4.1. Descriptive analysis

Prior to the link analysis we made a frequency distribution by country of the 1000 selected universities. Table 1 shows the number of universities by country, listing only the first 15 countries. United States (US) universities make up 36.9% of the entire sample, followed by the United Kingdom (UK) (6.8%) and Germany (6.6%). This distribution is also observed in the Top 200 of the ranking, which suggests that there is a digital divide in favour of US universities. The low performance of poorer countries like Russia (0.6%) and India (0.4%) is also clear.

Table 1
Universities distribution by country (15 first).

Country              Universities    %
United States        369             36.9
United Kingdom       68              6.8
Germany              66              6.6
France               50              5.0
Spain                41              4.1
Canada               39              3.9
Japan                35              3.5
Italy                34              3.4
Australia            30              3.0
China                17              1.7
Taiwan               17              1.7
Sweden               15              1.5
Brazil               14              1.4
The Netherlands      13              1.3
Finland              12              1.2
Rest of the World    180             18.0
Total                1000            100

4.2. Geographical map ¹

Fig. 1 shows the geographical distribution of web pages by country and the incoming and outgoing links among these countries. Two regions stand out for their large amount of web pages: North America (USA and Canada) and the European Union (EU) zone. The USA is the country with most web pages (50.57%), holding half of the world's academic web pages indexed in Yahoo! Search. It is followed by Germany (7.14%) and the UK (4.28%) in the EU. Besides these zones, notice the web development of Japan (2.35%), Australia (2.35%) and China (2.33%) in the East and Brazil (0.94%) in South America. In contrast, two zones have no universities in the sample: Africa (with the exception of South Africa) and the Middle East (with the exception of Israel and Saudi Arabia).

¹ Full colour pictures can be seen in the e-print version of this journal or in: http://internetlab.cindoc.csic.es/cv/11/world_map/.

From the US position, the upper loops show the outgoing links and the lower loops the incoming ones. The most important link flows are between North American countries and EU countries, followed by links between East Asian and Oceanic countries and the US.

4.3. Network graph

The world-class network (Fig. 2) shows clustering, because its clustering coefficient (C = 527.25) is considerably higher than that of a random network (C = 35.14), and its average path length (l = 2.26) is rather low (Watts & Strogatz, 1998). Visually, small-world properties can be seen through the traversal links that run across the network, connecting distant clusters (Fig. 2). The in- and out-degree frequency distributions follow a power law trend (γ_in = 0.81; γ_out = 0.73), suggesting scale-free properties as well (Barabasi, Albert, & Jeong, 2000).

Fig. 2 shows the graph of the 1000 higher education institutions. First, each university is linked with the universities of its own country. Thus, we can visually detect homogeneous national groups such as Germany (red), the UK (light green) or Japan (orange). ² However, we can also see that there are countries that do not constitute a compact group, such as France (dark blue), Canada (white) and other countries with a small set of universities such as the Netherlands (dark red). This may be because some countries are included in other larger national sub-networks: Canada is related to the US and the Netherlands to the UK. This describes a cumulative process in which each national sub-network is aggregated to other ones, like an accretion model.

The graph also shows linguistic (Thelwall et al., 2003) and geographical relationships (Thelwall, 2002a, 2002b). The European countries are located on the top side of the picture, while the bottom side is mainly taken up by Asian and American ones.
It shows, for example, that Spanish (purple) universities are between the European and the Latin-American ones. Observe that size is related to link attraction, because the large universities are located in the core of the network. Nevertheless, some countries, specifically Asian ones (China, Japan and Taiwan), have large universities that are far from the core. This may be caused by low development of English pages by these countries (Vaughan & Thelwall, 2004).

Fig. 1. Geographical map of the distribution of pages by country and their link flows.

² For interpretation of color in Fig. 2, the reader is referred to the web version of this article.

The main core of the World network was detected with the k-cores method. The central core consists of 116 nodes with degree 93. This highly connected cluster has 98 American universities; the rest are from Canada (7) and Europe (11). Fig. 3 shows this central core in detail, highlighting universities like Harvard, Stanford or Massachusetts Institute of Technology (MIT), which are located in the centre of the graph and attract a huge amount of links from the entire network. Next, the important European universities in the core of the network bring their national networks closer, as with Cambridge and the British network, Trier and the German one, and the Swiss Federal Institute of Technology Zurich (ETHZ) and Switzerland. These principal universities act as gateways that connect their sub-networks to the network core, causing a high density of the network. However, there is no presence of Asian, African and Latin-American universities, with the exception of the Israeli, Turkish and some Asiatic ones, which are located around the United States sub-network.
This may be because countries that are not linguistically and geographically integrated with other countries are mainly connected to US universities, which have a great weight in the network.

We also calculated the in- and out-degree of each university across the whole network and ranked them. Table 2 shows the top 10 universities by InDegree. United States universities are the most interconnected in the network. MIT (78.1) and the universities of Berkeley (73.5) and Stanford (73.1) are the most linked web domains in the network (Table 2). In contrast, Table 3 shows the top 10 universities by OutDegree. These are the universities that keep the network connected by making outgoing links. This table is also dominated by US universities, particularly the universities of Wisconsin-Madison (47), Stanford (41.8) and Florida (41.2) (Table 3). Notice that both tables only include US universities; the first European universities in the indegree rank are Cambridge in 18th and Leeds in 19th. In the outdegree rank, the first are ETHZ in 15th and the University of Amsterdam in 22nd.

As noted above, the World network is the aggregated union of national sub-networks. The betweenness centrality index detects the gateway universities that tend to connect these national sub-networks with the remaining ones. Table 4 shows the principal universities in each country according to their betweenness centrality. For example, the Japanese university with the highest betweenness centrality in the whole network is the University of Tokyo and the Taiwanese one is the National Taiwan University. Because this network is constituted by national sub-networks (Fig. 2), betweenness centrality may be considered a suitable indicator of universities with more international scope. Thus, we can identify the outstanding universities in each country, such as MIT in the US, Cambridge in the UK or ETHZ in Switzerland. These universities connect local web spaces internationally.
However, there are no German or Spanish universities in the top positions, although both countries have a good position in the network. We suggest that, as there is a linguistic factor in the relationships between countries, the German-speaking network is represented by ETHZ and the Spanish-speaking one by the Autonomous National University of Mexico (UNAM). Moreover, the betweenness index is rather close to the degree indicators, so we can state that these universities are the most important in their national or linguistic sub-network.

Fig. 2. Network graph of the World class universities on the web (N = 1000; arcs ≥ 50 links).
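
The degree and k-core indicators defined in Section 3.3 can be illustrated on a toy network with a short pure-Python sketch. It is not the Pajek procedure used in the paper: the node names and arcs are invented, and computing core numbers on the undirected version of the network is an assumption, since the text does not say whether direction was taken into account.

```python
from collections import defaultdict


def degrees(arcs):
    """Count incoming and outgoing lines per node from (source, target) arcs."""
    indeg, outdeg = defaultdict(int), defaultdict(int)
    for source, target in arcs:
        outdeg[source] += 1
        indeg[target] += 1
    return dict(indeg), dict(outdeg)


def core_numbers(arcs):
    """Core number of each node, ignoring arc direction (an assumption).

    Nodes are peeled off in order of lowest remaining degree; a node's core
    number is the largest minimum degree seen up to its removal.
    """
    neighbours = defaultdict(set)
    for source, target in arcs:
        if source != target:
            neighbours[source].add(target)
            neighbours[target].add(source)
    degree = {node: len(adj) for node, adj in neighbours.items()}
    remaining = set(degree)
    core, k = {}, 0
    while remaining:
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for other in neighbours[node]:
            if other in remaining:
                degree[other] -= 1
    return core


def central_core(arcs):
    """Return the set of nodes with the highest core number."""
    core = core_numbers(arcs)
    top = max(core.values())
    return {node for node, value in core.items() if value == top}


# Invented toy network: A, B and C link to each other, D only links to A.
toy_arcs = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "A"), ("B", "A")]

if __name__ == "__main__":
    print(degrees(toy_arcs))
    print(core_numbers(toy_arcs))
    print(sorted(central_core(toy_arcs)))
```

In the toy network, D sits outside the central core because its single link cannot support membership in a group where every node has at least two connections, which mirrors how peripheral universities attach to, but do not belong to, the network core.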