Experimental web bulletin for users of college libraries in the UK - specifically for the University of Cambridge but independent of official College or University sites. Posts have been non-existent recently; we hope to resume more regular posting towards the end of 2006.

CAMBRIDGE, UK




Please sign the petition in support of the European Commission's proposed Open Access Self-Archiving Mandate



Currently reading...
The Worms Can Carry Me To Heaven by Alan Warner
This book, his fifth novel, is a step change from his previous novels into a more experimental style which seems autobiographical in its detail, switching between different times of his(?) life in Spain and his 'Home City' - never named, but could it be Malaga? Warner is best known for his first novel, Morvern Callar (1995), after it was made into a movie in 2002 by British director Lynne Ramsay (who also made Ratcatcher) starring Samantha Morton. Warner was chosen as one of Granta's Best of Young British Novelists in 2003.



Friday 18 July 2008

Search + Resources

Exporting references from Google Scholar

Some dude from Norway, Alexander Refsum Jensenius, has written short step-by-step instructions on how to export references from Google Scholar. Could be useful. Commenters point to web apps like CiteULike and WizFolio, and to a downloadable app for organizing PDF files called Papers - which is Mac only.


Wednesday 3 January 2007

Search

10 Google myths

Ionut Alex in his Google Operating System weblog says in his post 10 Google myths:

I’ve heard many inaccurate things about Google this year, and most of them are spread by word of mouth. Maybe Google should do a better job at explaining things that may seem trivial to computer experts, but difficult [to] understand for other people.


Search + Resources

Live Search now covers books

Microsoft has added Live Search Books to its growing list of Live Search options - a rival project to Google Book Search. Live Search is the new improved MSN Search and Microsoft’s attempt to catch up with Google. Let me remind you that Live Search already offers an Academic option similar to Google Scholar with a useful emphasis on references. And, naturally, there’s maps as well, like Google maps but, some say, better?


Thursday 15 September 2005

Search + Resources

Google Scholar changes and Google Blog Search

Google Scholar, Google’s specialized search engine for finding ‘scholarly’ material on the web (using a small subset of its main index), has introduced a broad search by subject option on its Advanced Scholar Search page. Searches can be limited to any combination of the following seven subject areas:

- Biology, Life Sciences, and Environmental Science
- Business, Administration, Finance, and Economics
- Chemistry and Materials Science
- Engineering, Computer Science, and Mathematics
- Medicine, Pharmacology, and Veterinary Science
- Physics, Astronomy, and Planetary Science
- Social Sciences, Arts, and Humanities

The On Google Scholar weblog, which first alerted me to the new search option, points out that the list of subject areas highlights one of Scholar’s drawbacks - it is heavily dominated by the natural sciences. As the list above demonstrates, the arts are confined to only one category: ‘Social Sciences/Arts/Humanities’.

A commenter, who only calls himself ‘Brad’, points out another conclusion to be drawn from the subject area search - Scholar is cataloging:

“Google employees are placing resources into subject categories. I doubt very seriously it’s 100% powered by AI. No, this isn’t LC cataloging, but it’s cataloging nonetheless.”

Background
Google Scholar enables web searches specifically for scholarly literature, including peer-reviewed papers, theses, books, preprints, abstracts and technical reports from all broad areas of research. Use Google Scholar to find articles from a wide variety of academic publishers, professional societies, preprint repositories and universities, as well as scholarly articles available across the web.

Google has launched Google Blog Search. Google is the first major web search engine to launch a weblog-specific search option (if you don’t count Ask Jeeves, who own Bloglines - although Bloglines is primarily a web-based ‘feed aggregator’ for monitoring RSS feeds). The size of Google’s Blog Search index is relatively low at around 8.7m weblogs, but this will probably increase over time. It covers all weblogs, not just those published using Google’s own blogging site Blogger. It will be strong competition for existing weblog search sites such as Technorati - up to now considered to be the blog search engine and currently tracking over 17m weblogs - twice Google’s number. There are lots of other weblog search engines, for example: Feedster, BlogPulse, Bloglines, PubSub, Blogdigger, IceRocket, Gigablast and Daypop (which uses a high-quality, but much smaller, index hand-picked by human editors), to list the better known ones.

Technorati, which many consider should be afraid - very afraid - of Google’s move into its territory, says:

“[Google Blog Search] will mark a major milestone for the World Live Web. At Technorati, we have a tremendous amount of respect for the Google team and for everything they’ve done in the world of search. I’m sure that they’ll continue to improve over the coming months, perhaps including tags, recent images and links, zeitgeists, blogger tools, and other types of semistructured data. I’m sure that they’ll also start indexing the full-text of blog posts, not just the partial text found in most blog feeds.”
This translates to: what took you so long and, by the way, don’t forget all the stuff we do that you don’t. SearchEngineWatch points out that Google Blog Search indexes only the XML feeds and not the actual HTML weblog text:
“Although Google Blog search focuses primarily on content published to the blogosphere, it’s not a true full-text search across all sources, according to Goldman. This is because some publishers only syndicate excerpts of content via RSS. Google’s blog search indexes all of the content it finds in [RSS] feeds, but does not attempt to access and index the full content available on a publisher’s web server.”
Unusually, the blog search results default sort order in Google is by ‘relevancy’ - although they can also be sorted by date. Most blog search engines default to a date-based sort which would seem to be the most useful order (generally we are looking for ‘breaking news’). As well as the usual standard Google Search operators, Google Blog Search adds four of its own to help narrow down weblog searches:

- inblogtitle:
- inposttitle:
- inpostauthor:
- blogurl:

ResearchBuzz is impressed:

“Google has an impressive advanced search for their blog search, which Feedster should take a look at. You can search by blog title (special syntax inblogtitle: ) or post title (special syntax inposttitle: ). You can limit your searches to particular URLs. There’s also syntax to limit results by date — either a particular set of dates or a time span (last 6 hours, last 12 hours, etc.) It’s about time that someone took the delineation offered by RSS feeds and made a nice advanced search out of it. I’m sure this is only the beginning.”
To take the example Google gives in its ‘Frequently Asked Questions’ (FAQ), the search query [mandolin inpostauthor:Graham] will find blog items about mandolins written by people named Graham. Two points to note here: (1) the square brackets are Google’s ‘official’ way of marking the beginning and end of queries - only the text within the square brackets should be entered - TIP: see them as the borders of the search box; (2) as with standard Google, there should be no space after the colon of the search operator - as in inpostauthor:Graham. Google Advanced Blog Search achieves the same search in a more user-friendly way - and it is the advanced search options which really set Google Blog Search apart from the rest. Also, according to Robert Scoble’s Scobleizer weblog, Google Blog Search is very fast.
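The two points about query notation can be put into a few lines of Python. This is only a sketch of the convention described above: the function name and the tidy-up rules are my own invention, not anything Google provides, and the operator list is just the four operators named in this post.

```python
import re

# The four blog-specific operators listed in this post.
BLOG_OPERATORS = ("inblogtitle", "inposttitle", "inpostauthor", "blogurl")

def to_search_box_text(documented_query):
    """Turn a query written in [bracket] notation into the text to type."""
    text = documented_query.strip()
    # The square brackets only mark where the query starts and ends.
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1].strip()
    # Remove any space typed after an operator's colon,
    # e.g. "inpostauthor: Graham" becomes "inpostauthor:Graham".
    pattern = r"\b(" + "|".join(BLOG_OPERATORS) + r"):\s+"
    return re.sub(pattern, r"\1:", text)

print(to_search_box_text("[mandolin inpostauthor:Graham]"))
print(to_search_box_text("mandolin inpostauthor: Graham"))
```

Both calls print the same thing, mandolin inpostauthor:Graham, which is exactly what should go in the search box.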

Google Blog Search, like a lot of search engines now, also allows you to subscribe to a ‘feed’ of your weblog search. The feed (often referred to as an ‘RSS feed’) automatically alerts you to new instances appearing which satisfy your search query. You will need to use a ‘feed reader’ - such as the (free) web based Bloglines - to subscribe to the update feed.
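For the curious, this is roughly what a feed reader does with a search feed. The sketch below assumes a plain RSS 2.0 feed and uses a made-up sample (the titles and example.org links are invented for illustration); real feeds may be Atom rather than RSS and would be fetched over the web rather than pasted in.

```python
import xml.etree.ElementTree as ET

# Invented sample feed standing in for a saved blog search.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Blog search: mandolin</title>
    <item>
      <title>Restringing a mandolin</title>
      <link>http://example.org/a</link>
    </item>
    <item>
      <title>Mandolin festival notes</title>
      <link>http://example.org/b</link>
    </item>
  </channel>
</rss>"""

def new_items(feed_xml, seen_links):
    """Return (title, link) pairs for feed items whose link has not been seen yet."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        link = item.findtext("link")
        if link not in seen_links:
            fresh.append((item.findtext("title"), link))
    return fresh

# Pretend the first post was already read on an earlier check.
already_seen = {"http://example.org/a"}
for title, link in new_items(SAMPLE_FEED, already_seen):
    print(title, "-", link)
```

The reader simply remembers which links it has shown you and reports anything new each time it checks the feed.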

It is important to note that other ’scholarly’ search engines exist. Two excellent ones (considered to be superior to Google Scholar by the academic community) are Scirus which searches over 200 million science-specific web pages and OAIster which searches amongst a constantly increasing collection of freely available (and previously difficult-to-access), academically-oriented digital resources. Scirus offers a Scirus-Google test. The Charleston Advisor published a detailed review of Google Scholar [April 2005].


Background
Google Blog Search is Google search technology focused on weblogs (or ‘blogs’). Google is a strong believer in the self-publishing phenomenon represented by blogging, and we hope Blog Search will help our users to explore the blogging universe more effectively, and perhaps inspire many to join the revolution themselves. Whether you’re looking for Harry Potter reviews, political commentary, summer salad recipes or anything else, Blog Search enables you to find out what people are saying on any subject of your choice. Your results include all blogs, not just those published through Blogger; our blog index is continually updated, so you’ll always get the most accurate and up-to-date results; and you can search not just for blogs written in English, but in French, Italian, German, Spanish, Chinese, Korean, Japanese, Brazilian Portuguese and other languages as well.
Backlinks
Google Scholar Report - could do (much) better [20 June 2005]
New! Google Scholar search for academic material [19 November 2004]


Friday 1 July 2005

Search

Is search the new file management system?

A superb article by John Hiler on his site (Microcontent News) titled Google’s War on Hierarchy, and the Death of Hierarchical Folders. Dated May 2005, it has come somewhat belatedly to my attention. In the article, Hiler proposes the demise of the folder hierarchy file storage system in favour of just searching for the file you want - oh, and Google plays a key part in all this (they pop up everywhere - I’m almost afraid to mention the ‘G’ name again so soon!). The front page of Hiler’s site features others’ comments on the article.

In case you are wondering what he is going on about, those of you who have used Gmail (Google’s free webmail service - but still only available by invitation) will have experienced it for real. Gmail encourages the use of search to find your stored emails rather than placing them in folders by subject. In fact you can just move all your read emails to the ‘archive’. Gmail also offers descriptive ‘labels’ (more generally known as ‘tags’ - another concept whose time has come) to attach to your emails and a massive 2+GB of storage. The reasoning is that, once you have stored a large number of emails (or files), the folder system becomes less practical. I know I often have to think hard about just where I stored that important file. Whereas, just typing in a few key words about it in a Google-style search box could locate it in a jiffy. In fact, that is here already in the shape of Google Desktop, a downloadable search facility which will index everything on your PC and helpfully provide a search box to find it all. (Note that there have been reservations about privacy - also with Gmail - concerning what information Google stores about you and your files/emails.)

So, just as web search long ago shifted from the original Yahoo!-style ‘web directories’ hand-picked by humans (equivalent to folders) to Google-style open search - especially when Google introduced its PageRank algorithm (adapted from bibliographical citation analysis - but using weblinks instead), just as that shift happened, Hiler proposes that email and desktop search are now beginning to (and will eventually fully) migrate from folder storage systems to keyword search finding systems. And Hiler takes the whole thing much further: postulating ‘GDrive’ - storing all files on virtual hard drives in “boundless free [web] storage, with a searchable interface freely provided by Google”.

I use Gmail (in parallel with other email services). It takes time to get used to Gmail’s method of email ‘filing’: moving everything into one archive after years of messing with large numbers of folders. And this is the main obstacle - it takes effort to get used to and trust this new way of working. Folders feel ‘safe’ and familiar; I feel I am ‘on top of my work’ when it is neatly filed away in the appropriate folders. Using the keyword search felt like giving up much of my own control over my work - but now that I have begun to experience the benefits, it seems like the way to go.

John Hiler has written a really superb article (the first of three) which also manages to include fascinating potted histories of web search, email and the folder system of file management. And who knew that Yahoo was an acronym of “Yet Another Hierarchical Officious Oracle”?


Tuesday 28 June 2005

Search

The whole world’s been Googled

Phew!! Trying to keep up with developments at Google is hard work. Just when you think it’s safe to relax they go and launch a whole new pile of applications. Today’s offerings are Google Earth and (much less exciting) Google Video Search and video player.

Google Earth, which, as Stephen Downes modestly notes, “is essentially a three-dimensional representation of the entire Earth”, is now available FREE!! - at least for the basic version. For a mere US$20 you can get the enhanced ‘Plus’ version which adds GPS device support, the ability to import spreadsheets, drawing tools and better printing. There is also the ‘Pro’ version which costs US$400. [NB: there has been some recent limiting of the downloading and access to Google Earth - Google seem to be trying to roll it out more evenly. However, I believe this has now been sorted out?]

A 3D interface to the planet, Google Earth combines satellite imagery, maps, Google search and so on. Now you too can ‘fly from space to your neighbourhood’, view cities in 3D and a whole bunch of other stuff: video playback of driving directions, tilt, rotate, and activate 3D terrain and buildings for a different perspective on a location, easy creation and sharing of annotations among users. You need to download a (10MB) application to make it work, and it requires a reasonably powerful machine to run it, but it’s fun to try! [from Phil Bradley’s Blog]

In October 2004, Google acquired Keyhole Technologies. Google Earth is basically an adaptation of this software:

“With Keyhole, you can fly like a superhero from your computer at home to a street corner somewhere else in the world - or find a local hospital, map a road trip or measure the distance between two points. This acquisition gives Google users a powerful new search tool, enabling users to view 3D images of any place on earth as well as tap a rich database of roads, businesses and many other points of interest. Keyhole is a valuable addition to Google’s efforts to organize the world’s information and make it universally accessible and useful.”

Almost simultaneously, Google has launched a video viewer to go with its recently launched Video Search. The viewer works directly from the web page when you want to view videos that you’ve found. On this, Phil Bradley writes:

“I’ve installed it and it’s quick and easy - don’t even notice it. It works very well with video search - if you can find a video to play! I’ve done a lot of searching and found what I would imagine are very interesting videos, except that they’re not available. Seems kinda pointless, and not a little irritating to me, but there you are. Google video now with video but without uh.. video.”

Google is on the rise - its stock is reaching new closing highs this week, topping the US$300 mark for the first time on Monday at US$304.10. Its web search engine reportedly grabs 52 percent of the U.S. search market. [heads up from Google Blogoscoped].

Back to Stephen Downes to wind up this item by waxing philosophical about Google Earth (I rather like his observation):

“Toss your old paper-based atlas into the dustbin. What Google Earth demonstrates more than anything else is the difference between paper-based and digital content. It is to this difference we should be aspiring in online learning.”

First announced to the world, I believe, by Blog News Channel’s Inside Google (hat tip).

UPDATE: Guess what? Check this out: MSN Virtual Earth to take on Google Earth!!! Microsoft sends news today that founder Bill Gates has announced an MSN Virtual Earth service is to debut in the summer. The service is promised to provide:
- Satellite images with 45-degree-angle views of buildings and neighborhoods
- Satellite images with street map overlays
- Ability to add local data layers, such as showing local businesses or restaurants [etc., etc…]

And before we all get too excited about Google Earth, Brad Hill (author of Google For Dummies no less) in his Unofficial Google Weblog says:

“I keep seeing references to Google Earth as a “3D mapping tool”. I love Google Earth, but please, journalists, chill on the 3D angle. As a 3D renderer, Google Earth is not ready for prime time. In many (possibly most) scenarios, the 3D terrain feature does not work - though when it does (check out Mount St. Helens, or the hills behind Sausolito, CA) it is spectacular. The 3D effect in selected cities was tossed into the program just before its release, and must generously be described as an early work in progress. Without any texturing, the ghostly gray images of buildings are a distraction, to be invoked only when absolutely necessary. They compare especially poorly to Microsoft’s 3D work-in-progress, which features high-rez photos of buildings at a 45-degree angle.”

This can be answered thus [thanks to Adam Podolnick]: “Microsoft’s photos of buildings at a 45-degree angle are just that, photos. No 3D rendering there.”


Monday 20 June 2005

Search + Resources

Google Scholar report - could do (much) better

Much praise has been showered on Google Scholar, Google’s specialized ‘scholarly’ search engine, since it launched last November. And for those without access to very expensive mega-databases such as Web of Science (WoS) from ISI (subscribed to by the University Library) and the recently launched Scopus from Elsevier, Google Scholar is (being free of charge) better than nothing.

However, Peter Jacso (Associate Professor at the Library and Information Science Program, University of Hawai‘i at Mānoa) is not impressed. He first reviewed Google Scholar back in November 2004 after it launched and he wasn’t too keen on it then. He reviewed it again recently and, if anything, is even less keen now. He concludes, rather damningly:

“There are certainly many journals of many publishers covered [by Google Scholar] to keep casual users, high-school and undergrad students, TV talking heads and shallow journalists happy, but for scholarly research the breadth of coverage is not sufficient, the implementation is sloppy and the software options are inferior.”

He even says that, in many cases, you may be better off using standard Google rather than Google Scholar to find ‘scholarly’ material! And many others are not too keen on GS. For instance, another detailed review of Google Scholar by Martin Myhill, Deputy University Librarian at University of Exeter, appeared recently in The Charleston Advisor. Although more favourable than Peter Jacso, Myhill also has reservations:

“The vast majority of academic literature is found in the ‘hidden Web’. While Google Scholar has made valiant attempts to include a range of resources in this category, it is apparent that coverage leans heavily on the sciences, rarely includes all the offerings even from partner publishers and misses many of the quality resources which are more usually accessible to scholars through institutional subscriptions.”

The same criticisms come up repeatedly: the lack of search options, no indication of the sources Google Scholar accesses or of how often they are updated, search results that are too variable and contain too much ‘noise’ and, finally, what exactly does Google consider to be ‘scholarly’?

A recent improvement by Google is that libraries can now link their holdings through Google Scholar - any library or institution that has the proper link resolving software can hook into Google Scholar and provide direct links to Google Scholar search results. So, if our library adopted the system and had access to a particular reference in the GS results page, the correct link to our appropriate full-text copy would be provided. Currently, you may be denied access to articles which you should have access to because Google Scholar cannot know the correct link. I asked Patricia Killiard (Head of Electronic Services and Systems at the UL) if there were any plans for the UL to provide this GS linking service. She replied:

“The set-up needs to be done in OCLC FirstSearch. We will either do this over the summer or set about procuring an OpenURL resolver, if funding becomes available. However, the poor quality of Google Scholar’s search facilities hasn’t encouraged us to make a priority.”

A UL Websites and Subject Gateways web page states the problem, warning you not to pay for access to an article to which the UL may already subscribe:

“Google Scholar - a new search service focused on academic content - is also available. As well as search results, Google Scholar will also give you links to articles citing the item found - providing extra lists of relevant articles. Please check our Electronic Resources pages before paying for any item - we may have a subscription to it, meaning that staff and students of the University can access it free.” [my emphasis]

A useful summary of how to use Google Scholar and the caveats that apply is available at Emory University Libraries of Atlanta, Georgia.

In summary, as with any resource, Google Scholar has its uses - but it is important to know about its limitations.


Friday 27 May 2005

Search + Resources

Google Print launches as a standalone search service

The Google Print search engine, which was covered in an earlier posting, has now launched as a standalone Google search service with its own front page. Previously, Google Print results were listed within Google’s standard results pages.

Google Print is basically a Google search of books. Google has invited publishers to send them PDF versions of their books or to let Google scan them - the incentive for publishers to do this is that they could be rewarded by increased sales of the books they allow Google to hold on its database (in the search results, a link to ‘buy this book’ is offered - Google says it gets no revenue from this). Also, Google has been (and is, as I write this, and will be for many years to come) scanning books and storing them as digital - and searchable - text files in collaboration with a select few large U.S. academic libraries plus the U.K.’s Bodleian Library at Oxford. The submission of books by publishers is known as The Project for Publishers and Google’s own library scanning venture is known as The Library Project. I like to think of Google Print as a way of extending the internet back in time, theoretically back to the 1450s and the invention of the printing press - and perhaps even earlier (when Google can develop reliable enough handwriting recognition software)!

Just type your search term in Google Print and any book which includes that term within its pages will be listed in the results (along with a small image of the book’s cover). Clicking on a book in the results displays an image of the page on which the search term appears (helpfully highlighted for you) and you can view a couple of pages either side of the found page. The small matter of copyright means that Google shows only a couple of pages of works still in copyright (which, generally, means those written less than 75 years ago - but this varies between different countries and depends on the edition of the work). So, you cannot just read the whole book and after browsing a few pages Google will ask you to log in (eg with your Gmail account) or create an account if you want to see more pages.

In fact, the Google Print project has been the subject of much negative comment recently and is accused of being in breach of copyright. Business Week had a good article: A Google Project Pains Publishers and even published the text of a letter sent to Google by American publishers: The University Press Assn.’s Objections (also pdf version). An excellent article in Information Today sets out the whole thing very clearly.

In addition, Europe wants to start up a competitor to Google Print to broaden the scope away from mostly English language texts. Then there’s the Internet Archive which has existed for years, not to mention a number of websites already offering full texts of out of copyright books, especially fiction, (notably Project Gutenberg), some listed in the column to the left.

I tried searching for Moby Dick (written in 1851, so it should be well out of copyright) in Google Print. I could not quickly find an unrestricted out of copyright edition. I searched for it at Project Gutenberg, and immediately came up with at least four unrestricted full text versions (albeit not images of actual book pages, but plain text files laboriously typed/scanned in by volunteers). So, Google Print is perhaps not the best source for out of copyright full text access, but it is superb as a book browsing service.


Friday 15 April 2005

Search + Resources

Google Library Project - ‘books to bits’

[With thanks to an item in Peter Suber’s Open Access News, the best source of anything to do with the fast-growing Open Access movement, I would like to pass on the following]

An excellent article on the expansion of the Google Print project appeared in Technology Review the other day. Called the Google Library Project, it is Google’s ambitious plan to manually scan and digitize millions of books in four of America’s largest libraries plus Oxford University’s Bodleian Library here in the UK (see the Bodleian on Google’s scanning). So, virtually any book that has been published will, one day, be full-text searchable on the web through Google.
The libraries have allowed Google’s own staff to ‘set up shop’ on their premises and to install Google’s own top secret high speed book scanning equipment. The Google Library Project will take many years to complete, resulting in millions of books being full text searchable through Google. Effectively, Google is attempting to ‘backdate the internet’ to the beginning of written history! Initially, only books out of copyright and in the public domain are being digitized. Each library will be given a copy of the final results with no strings attached, except that they cannot allow use that hurts Google.
The project was originally referred to as ‘Project Ocean’. The New York Times somehow got wind of it way ahead of the formal announcement, mentioning it towards the end of a long article about ‘search engine wars’ published back in February 2004.
This good introductory article appeared in Information Today soon after the project’s official announcement at the end of November 2004.


Friday 19 November 2004

Search + Resources

New! Google Scholar search for academic material

Google Scholar, or ‘Schoogle’, a new web search service from Google, should prove useful to those of you searching for ‘scholarly’ (ie academic) research articles in your studies. Google has isolated a subset of its (recently increased) 8 billion index which it considers to be scholarly material. This means that searches using Google Scholar should exclude thousands of unsuitable search results you may get using regular Google.

Google does not disclose the sources of its data or even the size of the scholarly subset. It has clearly made special arrangements with publishers and other data providers to allow it to access material in passworded subscription-only ‘deep web’ areas. As an indication of the size of the Scholar database, a search for ‘the’ using Google Scholar currently gives 289m results (compared with 8bn for regular Google).
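As a rough back-of-envelope check on those two figures, here is the arithmetic in Python. It is a sketch only: hit counts for ‘the’ are a crude proxy for index size, and the variable names are mine.

```python
# Result counts quoted above for a search on 'the' (both are approximate).
scholar_hits = 289_000_000    # Google Scholar
google_hits = 8_000_000_000   # regular Google

# Share of the main index that the Scholar subset might represent.
share = scholar_hits / google_hits
print(f"Scholar is roughly {share:.1%} of the main index")
```

On these numbers the Scholar subset comes out at about 3.6% of the main index, which fits the description of a small subset.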

The most interesting thing about Google Scholar seems to be its citation data, which make it an excellent citation database. Google scrapes the complete text of articles, including the citations. Google’s robots seem to be capable of reading and understanding citations - number of citations is one of the measures used to rank the search results. But, interestingly, Google Scholar actually provides a ‘cited by nnn’ link to a list of citations it knows about for each search result. Citation analysis is nothing new, but the comprehensive ones tend to be subscriber-only databases - for example Thomson’s ISI Web of Science (subscribed to by the University), or Elsevier’s recently launched Scopus (not currently subscribed to by the University). However, Google Scholar is freely available to all online.

Google have been working with the CrossRef organization on a comprehensive journal search project called CrossRef Search which is at pilot stage - try out the CrossRef Search box at Cambridge University Press for instance. CrossRef Search is planned as a freely accessible full text cross journal search service that, for example, any library would be able to include in its website. One cannot help wondering if Scholar is an offshoot of Google’s work on CrossRef Search. (By the way, it is possible to modify the standard Google search URL to restrict it to CrossRef: add “&restrict=crossref” [without quotes] to the URL of a search you have done.)
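The URL tweak can also be done programmatically. The sketch below only shows the mechanics of adding the parameter described above to an existing search URL; the example URL is illustrative, the function name is my own, and whether Google still honours the restrict parameter is something I have not verified.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

def restrict_to_crossref(search_url):
    """Return the search URL with restrict=crossref added to its query string."""
    parts = urlsplit(search_url)
    # Keep the existing parameters, dropping any earlier 'restrict' value.
    params = [(key, value)
              for key, value in parse_qsl(parts.query, keep_blank_values=True)
              if key != "restrict"]
    params.append(("restrict", "crossref"))
    return urlunsplit(parts._replace(query=urlencode(params)))

print(restrict_to_crossref("http://www.google.com/search?q=open+access"))
```

Going through the query string properly, rather than just gluing text onto the end, avoids a doubled parameter if the URL has already been restricted once.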

Note that, unlike regular Google, you may not be able to access the full text of many of the articles found by Google Scholar. Google stipulates that abstract and citations must be accessible for every article it indexes, but not necessarily the full text of the article itself. You may be asked by a data provider to pay a small fee to see the full article - but please check with the Library first - we may have the actual journal hardcopy or you may be entitled to access the article for free via UL Electronic Resources. The UL home page now displays a message emphasizing this point. So, if you are accessing on-campus (and Cambridge subscribes to the relevant journal’s online access) you should be able to freely access the full text. Or, off-campus, you can use Athens passwords for those sources allowing it (get your Athens password from the library). Also note that some of Google Scholar’s results (from its citation gathering) will be offline resources - books, for example.

Google Scholar has one unique special syntax - author:[lastname]. The best way to search for a particular article is to use the author syntax plus a phrase (within double quotes). For example, author:einstein “theory of relativity”.
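If you want to generate such searches rather than type them, the pattern is simple. This is a minimal sketch: the function name is mine, it uses straight double quotes around the phrase (which is what a search box expects), and the second print just shows how the same text looks once URL-encoded.

```python
from urllib.parse import quote_plus

def scholar_query(lastname, phrase):
    """Combine the author: operator with an exact phrase in double quotes."""
    return f'author:{lastname} "{phrase}"'

query = scholar_query("einstein", "theory of relativity")
print(query)              # the text to type into the search box
print(quote_plus(query))  # the same text, URL-encoded
```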

Google Scholar is currently a ‘beta’ (preview) service. Expect refinements and improvements. Some areas are not well served, conference proceedings for example, and it would help if open access (OA) articles were flagged. Google Scholar is an attractive, easy to use search tool. But it is important to acquire a wide repertoire of search resources - an excellent and vast range of which is free to Cambridge students and staff through UL Electronic Resources.


 
