Friday, November 27, 2009

Week 11 Comments

Since my comment tally is a little less than my reading tally, and because I'm nosy, I commented on people's websites.

On Letty's blog:
http://letishagoerner2600.blogspot.com/2009/11/assignment-6.html

On Casey C.'s blog:
http://cac160.blogspot.com/2009/11/assignment-5.html

Saturday, November 21, 2009

Assignment 6: Website

Here it is, in all its glory. Kthxbai.

My website

Week 10 Muddiest Point

I'm really intrigued by the deep web. I think it would be an excellent thing if it gained greater visibility. The question is, how can this be done? People most often find websites through search engines, and the reason that these websites are on the search engines is that other people have found them previously and supplied links to them. It's a bit of a cycle. The most popular sites gain more popularity because they are already popular. So how do we introduce virtually unknown sites, like those found in the deep web, to the average web surfer? Or will we only be able to benefit from these websites if we go beyond the search engine?

Sunday, November 15, 2009

Saturday, November 14, 2009

Week 10 Readings

Web Search Engines
To me, the most astounding thing about web search engines is how they are able to (with moderate success) replicate human qualities. As many of our readings in LIS 2000 demonstrated, subjects like relevance are largely based on human intuition. Search engines must duplicate the thought process we as librarians go through to find the sources that patrons need, using only complex algorithms. By using numbers, they are able to make relevance quantifiable. It also interests me how the search engines use small features to optimize their searching. Their crawlers have a "politeness" function that stops them from bogging down any one particular web server. They are constantly improving their ability to detect spam websites. They know what key word or phrase will give them the best results in their search. It is truly amazing how web searches are able to translate human concepts like relevance into the language of technology.
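
As a rough illustration of the politeness idea, here is a minimal Python sketch of how a crawler might space out its requests to any one host. The class name, method names, and the two-second delay are all assumptions made up for this example; the reading does not describe how any real search engine implements this.

```python
from urllib.parse import urlparse


class PolitenessScheduler:
    """Hypothetical helper: tracks when each host may next be contacted."""

    def __init__(self, min_delay_seconds=2.0):
        # Assumed minimum gap between two requests to the same host.
        self.min_delay = min_delay_seconds
        # Maps host name -> earliest time the next request is allowed.
        self.next_allowed = {}

    def wait_time(self, url, now):
        """Return how many seconds to wait before fetching this URL."""
        host = urlparse(url).netloc
        return max(0.0, self.next_allowed.get(host, 0.0) - now)

    def record_fetch(self, url, now):
        """Note that the host was just contacted, so later requests must wait."""
        host = urlparse(url).netloc
        self.next_allowed[host] = now + self.min_delay
```

A crawler built this way would check wait_time before each request and call record_fetch after it, so pages on the same host are fetched slowly while pages on other hosts are not held up.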

Current Developments and Future Trends for the OAI Protocol for Metadata Harvesting
The OAI Protocol seems to be at an interesting crossroads. From what I gathered, the OAI has the potential to be an umbrella metadata harvesting system for many diverse content management systems. However, a lot of its current problems are due to the fact that these systems and their service providers are unique. The article mentions how repositories vary in their levels of completeness and in how thorough their metadata is. Furthermore, different service providers have different standards and tagging methods for their systems. The article later states that the OAI community itself is "very loosely federated" and that "a more formal method of communication between data and service providers is needed." To me, it seems that the success of the OAI community hinges on whether or not it takes the time to make these systems more compatible.

The Deep Web: Surfacing Hidden Value

This article's characterization of the deep Web surprised me on a number of levels. First of all, its massive size is unexpected. When we think of the Web, we think of something that is constantly changing due to the ephemeral nature of websites. It is odd to think that such a large amount of information remains. Speaking of such things, I was also surprised when the article characterized the deep Web as relevant. Again, the ephemeral nature of websites has led us to think of anything more than a month or even a week old as too old for the Internet.

Most interesting of all was how the deep Web illustrates the importance of metadata. Because the deep web is so massive, it cannot be browsed or tagged as easily as the surface web. Because of its lack of metadata, it is invisible to search engine crawlers, like it doesn't even exist. The article states that "serious information seekers can no longer avoid the importance or quality of deep Web information." It will be very interesting to see how they manage to bring the deep Web to the attention of search engines.

Week 9 Muddiest Point

Since the last class was not on the subject of XML, but digital libraries, I do not have a muddiest point.