Blog Archives
2012 in review
Posted by repplinger
The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.
Here’s an excerpt:
4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 18,000 views in 2012. If each view were a film, this blog would power 4 Film Festivals.
Reading Level Feature by Google
Posted by repplinger
Google recently came out with a new feature that sorts results by reading level. For example, if you were an elementary or middle school child looking up information for a school report, you would perform an advanced search and select a basic reading level. There are a few options for refining your results to a specific reading level (see screenshot below).
This is one of the first general internet search engines to offer this function (at least that I’m aware of). Many subscription-based databases offer similar tiered reading-level search and refining capabilities, such as EBSCOhost’s Middle Search Plus, which uses the Lexile reading level system to rate the grade level of its literature.
It would be interesting to find out how Google categorizes pages into reading levels. Does it use a controlled vocabulary? If specific words appear a designated number of times or in a particular sequence, would the page be considered advanced? Or perhaps if a page lacks advanced terminology, such as the scientific names used for classification, would it be considered basic or intermediate?
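I don’t know how Google actually assigns its levels, so the following is only a sketch of one long-standing public approach: the Flesch-Kincaid Grade Level formula, which scores text by average sentence length and average syllables per word. The syllable counter is a rough vowel-group heuristic, and the cut-offs for basic, intermediate, and advanced (`basic_max`, `intermediate_max`) are my own assumptions, not anything Google has published.

```python
import re


def count_syllables(word):
    """Rough heuristic: count runs of vowels, with a floor of one syllable."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level: 0.39*(words/sentences) + 11.8*(syllables/words) - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


def reading_level(text, basic_max=6.0, intermediate_max=12.0):
    """Map a grade score onto three labels; the thresholds are hypothetical."""
    grade = flesch_kincaid_grade(text)
    if grade <= basic_max:
        return "basic"
    if grade <= intermediate_max:
        return "intermediate"
    return "advanced"


print(reading_level("The cat sat. The dog ran."))
print(reading_level("Mitochondrial biogenesis necessitates coordinated transcriptional regulation."))
```

A formula like this only looks at sentence and word length. It knows nothing about vocabulary lists or subject matter, which is exactly the kind of thing the questions above are asking about.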
And could this somehow be used as a filter? For example, perhaps searches for explicit language or sexual content at a basic reading level would not yield results. While this would be very helpful for parents trying to screen objectionable material from their children, it might create a hurdle for adults who are studying a language (perhaps doing a report on a health topic in a second language). This is not the best example, since adults tend to be more sophisticated searchers and this should not be much of a barrier for them, but you get my point.
This is a wonderful resource for those interested only in a specific reading level range, such as scientists who want the technical information, or school-aged children who might want to exclude the really technical literature. Below are directions from the Google Support Page that explain how to use this new feature.
Features: Reading level
Sometimes you may want to limit your search results to a specific reading level. For instance, a junior high school teacher looking for content for her students or a second-language learner might want web pages written at a basic reading level. A scientist searching for the latest findings from the experts may want to limit results to those at advanced reading levels.
To limit your search results to a specific reading level, follow these steps:
- On the search results page, click Advanced Search below the search box.
- Next to “Reading level” within the “Need more tools” section, select your desired reading level (basic, intermediate, or advanced) or choose to show all results annotated with reading levels.
- Click Advanced search at the bottom of the page.
- At any time, you can click the X in the right corner of the blue bar beneath the search box to go back to seeing all results.
Posted in Data Mining, Database Reviews, Google, Metadata
Tags: Data Mining, Database Reviews, Google, Metadata
Google Instant Has Arrived
Posted by repplinger
Google Instant has arrived on the scene this week. This new Google search philosophy brings the list of results to searchers while they are typing, and the results are constantly changing as the users modify their search criteria (screen shot below).
Basically, you won’t have to wait for your results to show up, at least with a Google search. This brings a challenge to Google’s closest competitors, Yahoo! and Bing (by Microsoft). These search engines will probably try to match Google’s change since Google has been the major innovator of late (perhaps it is from the 80/20% work/personal project time philosophy which has spawned Google’s innovative streaks).
This change has also stirred up negative feedback in the advertising world, where businesses have built their advertising schemes around a more static model of ad placement. Honestly, this change should not affect advertisers much. In fact, advertisers should be ecstatic because more people will see advertisements as they modify their search parameters. Because the results page is dynamic, the results and ads change constantly with each character typed. And Google has not changed the way it ranks and displays ads.
Also, there is potential for users to become even more savvy searchers as they see their results change with their choice of words. They may explore alternative search strategies that pop up and that they might not have considered before, which may lead them to become more accurate and efficient searchers (assuming that they learn from the suggested phrases & words).
Google has posted a video on YouTube introducing Google Instant (below).
What do you think about Google Instant? I’d love to hear your opinion!
NetFlix Categories Replaces Dewey Decimal
Posted by repplinger

(Image source: Cronknews) Librarians embrace the classification change at the College of Eastern Nevada.
I ran across this interesting farce from the Cronk of Higher Education (a takeoff on The Chronicle of Higher Education) through a recent ALA email release. It describes how an academic library drops its subject headings for the NetFlix method of organizing content. At first I thought this was real, but it isn’t.
Just for fun, I wondered what would happen if this were real. One can understand switching a movie collection to match that of NetFlix, but to incorporate the book collection (plus all other materials) into this classification system would really be something!
There has been a consistent push for libraries to satisfy students and improve how library patrons access literature and all forms of information. Several libraries of late have moved to alternative classification systems to make browsing their collections more intuitive. Some have even moved to systems similar to some well-known large book chains.
However, let’s take a look at a few assumptions should a library ever make the big jump to a commercial movie organization system. It would have to change its entire collection of literature and all other types of info to fit into broad entertainment categories. It would lose its refined subject categories, sub-categories, sub-subcategories, etc. Would the library find other institutions to partner with in this venture, and how would its ties to other libraries be affected by a system that is not widely adopted? What would happen should the business fold (the chances of which would be significantly higher during these hard economic times)?
According to the Cronk of Higher Ed article, the institution in question, the College of Eastern Nevada, experienced a lot of support for this move. I just wonder what would happen to library users who are not familiar with NetFlix. Let’s say that the families of students, their financial support, had a NetFlix subscription. The family decides to cancel their NetFlix subscription to save money for rising tuition costs, car insurance for their students, or perhaps to offset the loss of a family job. Or even more likely, they never had a NetFlix subscription in the first place. Entering the library would truly be a completely different world, more so than it already is to some people.
Below is a screen shot from the Cronk of Higher Education post comparing the standard Dewey classification system with NetFlix.
Assuming the left column is equivalent to the right, one would have some very surprising browsing results. The entertainment content and selection is considerably different from the educational content of an academic library. Let’s say, for example, a student was looking for literature on the social justice movement of African-Americans in the 1960s & 70s. Might she naturally look under the Documentary section (listed above)? Also, some of the categories would seem counterintuitive, such as the Sci-Fi and Fantasy collections. I have to laugh at the thought of trying to find a book on cell biology in a science fiction section. At any rate, enjoy the Cronk of Higher Ed reading.
Click here to read related classification and metadata topics…
The Semantic Web, Coming to You Soon!
Posted by repplinger
I ran across this interesting bit of info just the other day. Google recently purchased Metaweb, a San Francisco-based semantic search company, because it “contains information on more than 12 million web ‘entities,’ from people to scientific theories.”
In other words, Google just bought a bunch of metadata. Metadata is basically descriptive information about something, such as the color of someone’s hair, their height, weight, etc. This purchase may signal that Google will soon add extra value to individual Internet resources and web sites. Ultimately, this means that your search results may become more accurate and relevant. And if Google steps up to the semantic web plate, will other search engines like Bing do the same?
Here is a link for further reading at New Scientist.com, which explains the details and what it could mean to you in the future.
Resource Description and Access: the New Cataloging Standard
Posted by repplinger
ALA released the new “cataloging” standard, known as RDA, or Resource Description and Access, earlier this week. I have a feeling that life will become a lot more interesting for librarians and patrons alike because of this change. Why does this matter to the average Joe?
Libraries have been trying to incorporate online resources into the traditional library catalog since the new technologies arrived on the scene. However, from the beginning these new technologies have defied the traditional library catalog classification system. They simply don’t fit the traditional “book” metadata format (metadata is descriptive information about a specific resource).
Catalogers eventually came to terms with this phenomenon (some earlier than others), and they began forming a new standard for organizing information, or “cataloging,” to enhance and eventually replace the current AACR2 cataloging standards.
The new standards will be geared to pull in metadata (e.g. title of a resource, description of it, and access) from online resources and make it much more useful and dynamic for the user. Hopefully, the future online catalog will be able to monitor changes to web resources and automatically update the changes by itself.
How useful would it be to have a master catalog of all resources in the United States (and even the world) showing what libraries own and can access? Instead of each library listing the materials it has, the national catalog would list available resources with FRBRized metadata, and a library could simply check a box to indicate whether it holds an item and where it is located. Not only would this help library patrons conceptualize what a library has to offer in terms of unique holdings and access, but it could also help vendors identify potential markets.
This is jumping the gun a little, but it is fun to dream! Below is the official news release from the American Libraries Magazine.
For Immediate Release
Tue, 07/13/2010 – 09:22
Contact: Jill Davis
Publishing (pub)
CHICAGO - ALA Editions, the publishing imprint of the American Library Association, announces the release of “Introducing RDA: A Guide to the Basics,” by Chris Oliver. Resource Description and Access (RDA) is the new cataloguing standard that will replace the Anglo-American Cataloguing Rules (AACR). The 2010 release of RDA is not the release of a revised standard; it represents a shift in the understanding of the cataloguing process. Oliver, cataloguing and authorities coordinator at the McGill University Library and chair of the Canadian Committee on Cataloguing, offers practical advice on how to make the transition. This indispensable Special Report helps catalogers by:
- concisely explaining RDA and its expected benefits for users and cataloguers, presented through topics and questions;
- placing RDA in context by examining its connection with its predecessor, AACR2, as well as looking at RDA’s relationship to internationally accepted principles, standards and models; and
- detailing how RDA positions us to take advantage of newly emerging database structures, how RDA data enables improved resource discovery and how we can get metadata out of library silos and make it more accessible.
Oliver has worked at the McGill University Library since 1989, as a cataloguing librarian and cataloguing manager. She received her M.A. and M.L.I.S. degrees from McGill University. She is the chair of the Canadian Committee on Cataloguing and has been a member of the committee since 1997. This has given her the opportunity to be involved with the evolution of RDA from its beginning. She served as a member of the Joint Steering Committee’s Format Variation Working Group and as chair of the RDA Outreach Group. She has given presentations on RDA in Canada, the United States and internationally.
ALA Store purchases fund advocacy, awareness, and accreditation programs for library professionals worldwide. ALA Editions publishes resources used worldwide by tens of thousands of library and information professionals to improve programs, build on best practices, develop leadership, and for personal professional development. ALA authors and developers are leaders in their fields, and their content is published in a growing range of print and electronic formats. Contact ALA Editions at (800) 545-2433, ext. 5418, or editionsmarketing@ala.org.
Read original source: http://www.americanlibrariesmagazine.org/news/ala/guide-rda-basics
Hybrids of Dewey
Posted by repplinger
Barbara Fister wrote a great article a few months ago on “The Dewey Dilemma.” It discusses how libraries are rethinking their customer service approach to browsing book collections and finding specific titles. According to the article, of the 200,000 libraries that use the Dewey system of classification in 138 countries around the world, most are sticking with the Dewey system. However, more and more libraries are considering the benefits of alternatives to Dewey. These include the Word Think system (see previous two posts: one and two), which uses broad categories & subcategories and then shelves books by title, and hybrid systems, where broad category labels are added above the call number on the book’s spine and the books are reorganized within the library by those broad categories.
The article also took into account a survey performed by the author on what librarians think of Dewey, and why patrons have trouble finding (non-fiction) materials with Dewey. It is always interesting to compare the perspectives of patrons and librarians and see what each group perceives as the problem. I always have to wonder how the surveys are constructed: whether the survey assumes that there is a problem (the problem I find most daunting is…) as opposed to offering a graduated response (on a scale of 1-10, how do you feel about this problem). Granted, it appears that there was at least an option saying that there are no perceived problems and things should stay the way they are.
From the results, nearly half of the librarians surveyed (48.4%) said that the current Dewey system could be improved by combining it with a more general subject scheme, followed by the sentiment that improved signage would help patrons find what they want more easily (26.9%). Patrons, on the other hand, described why they have trouble finding nonfiction materials. The top four reasons cited were:
- Trouble understanding the online catalog (68.4%)
- They feel intimidated by a classification system they don’t understand very well (66.3%)
- Want to go straight to the right shelf without having to look something up (63.2%)
- Call numbers are too complicated to use (50.5%)
What do I think? I think that we should listen to our patrons. If they think browsing materials is too difficult, something needs to be done, whether through better signage, regrouping collections to reflect a more intuitive way to browse for books, or both. The system, however, must have an efficient way to track down specific titles and group similar items together, or it misses its core purpose of organizing collections.
The hybrid systems mentioned in this article seem very promising. I was particularly impressed with the children’s division which took the opportunity to address the most common question they encounter (reading by age group). This arrangement makes enormous sense, and I wonder how this system will work in the long run. From the stats that were mentioned, it sounds like it is very intuitive to use and even little children know exactly where to go. Now I only wish my own public library would arrange its children’s books this way!
Better Search Software for Libraries
Posted by repplinger
Here is an interesting article that talks about how many current catalogs have lousy algorithms, and don’t do an adequate job of finding information within an individual library’s catalog.
After Losing Users in Catalogs, Libraries Find Better Search Software
September 28, 2009
By Marc Parry
http://chronicle.com/article/After-Losing-Users-in/48588/?sid=at&utm_source=at&utm_medium=en
Thomas Jefferson founded the University of Virginia. So you might think that typing his name into Virgo, Virginia’s online library catalog, would start you off with a book about him.
Jean A. Bauer tried it the other night. At the top of the results list were papers from a physics conference in Brazil.
The problem is that traditional online library catalogs don’t tend to order search results by ranked relevance, and they can befuddle users with clunky interfaces. Bauer, a graduate student specializing in early American history, once had such a hard time finding materials that she titled a bibliography “Meager Fruits of an Ongoing Fight With Virgo.”
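To make “ranked relevance” concrete, here is a toy sketch of my own (not Virgo’s software, nor any vendor’s). It scores each catalog record by how often the query terms appear, weighting title matches above subject matches, and sorts the hits by score. The field names, the weights, and the sample records are all invented for illustration.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into plain words."""
    return re.findall(r"[a-z]+", text.lower())


def relevance_score(query, record, title_weight=3, subject_weight=1):
    """Count query-term matches; the weights are hypothetical choices."""
    title_words = tokenize(record["title"])
    subject_words = tokenize(record["subject"])
    score = 0
    for term in tokenize(query):
        score += title_weight * title_words.count(term)
        score += subject_weight * subject_words.count(term)
    return score


def ranked_search(query, records):
    """Return matching records, highest score first; drop records with no match."""
    scored = [(relevance_score(query, r), r) for r in records]
    hits = [(s, r) for s, r in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in hits]


# Invented sample records, for illustration only.
catalog = [
    {"title": "Physics conference papers", "subject": "Held at the Thomas Jefferson hall"},
    {"title": "Thomas Jefferson: A Life", "subject": "Presidents; biography"},
    {"title": "Cooking Basics", "subject": "Food"},
]

for record in ranked_search("Thomas Jefferson", catalog):
    print(record["title"])
```

With these made-up records, the biography comes out ahead of the conference papers because the query terms appear in its title. That is the behavior the article says older catalogs lack.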
Posted in Data Mining, Libraries, Metadata, Technology
Tags: Data Mining, Libraries, Metadata, Technology
Wordle
Posted by repplinger
You’ve probably seen the work of Wordle without knowing what it is or does. I discovered Wordle.net at a campus meeting that took place yesterday. It groups and aggregates the words that are displayed within a web page. The more often a word is used within the page, the larger it is drawn to emphasize its use. Words that are large are used frequently, while small words are used infrequently.
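I don’t know exactly how Wordle computes its sizes, but the basic idea described above can be sketched in a few lines: count how often each word appears, then scale the font size between a minimum and a maximum. The linear scaling, the size range, and the short stop-word list below are my own assumptions.

```python
import re
from collections import Counter

# A tiny, hypothetical stop-word list; a real tool would use a much longer one.
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "for", "with"}


def word_sizes(text, min_size=10, max_size=72, top_n=50):
    """Map the most frequent words to font sizes, scaled linearly by count."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    counts = Counter(words).most_common(top_n)
    if not counts:
        return {}
    highest = counts[0][1]
    lowest = counts[-1][1]
    spread = highest - lowest
    sizes = {}
    for word, count in counts:
        # If every word occurs equally often, draw them all at the largest size.
        scale = (count - lowest) / spread if spread else 1.0
        sizes[word] = round(min_size + scale * (max_size - min_size))
    return sizes


print(word_sizes("Google books Google library Google books"))
```

In this sample, “google” appears most often and gets the largest size, while “library” appears once and gets the smallest.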
I tested Wordle.net with this blog (Library Shop Talk) to see how it compares with the purpose of this blog, and came up with the following. (This is only the home page with ten of the latest posts). Google is the most frequently occurring word. I’m not too surprised because I write a lot about Google. It was interesting to see the other words that were highlighted/emphasized: Mobile, books, publishers, source, librarians, copyright, available, read library, settlement, school libraries, groups.
Keep in mind that this is a snapshot in time, and the words will likely be very different next week. However, to make it interesting, I checked the entire blog, expecting even greater variation (see below). The results seem identical, so I wonder whether it was really picking up everything in the blog or just going with the ten latest posts again. It doesn’t even pick up technology, which has dozens of posts. At any rate, it is pretty fun and it is a useful analytical tool!
Posted in Art, Data Mining, Metadata, Tech Toys
Tags: Art, Data Mining, Metadata, Tech Toys
Consider This Product: Data Mining Personal Info
Posted by repplinger
Here is a fascinating article by the NY Times on how browsers are saving personal information about you and delivering more customized advertisements.
Ads Follow Web Users, and Get More Personal
“Hello, this is Joe your personalized marketer. Since I know you so well, your preferences, price range, buying habits, I want to mention this great deal from your favorite store that you’ll definitely want to check out. They have a HUGE discount on all of your regular purchases. It is unbelievable! …”
… For decades, data companies like Experian and Acxiom have compiled reams of information on every American: Acxiom estimates it has 1,500 pieces of data on every American, based on information from warranty cards, bridal and birth registries, magazine subscriptions, public records and even dog registrations with the American Kennel Club.
Posted in Data Mining, Metadata, Privacy, Technology
Tags: Data Mining, Metadata, Privacy, Technology











