Google Tech Talk
Given here at Manchester to a packed computing science lecture theatre. Fabrice Caillette is a great speaker and I really enjoyed the talk. As it was only an hour long, the talk focused on the search engine and how it works.
A couple of things particularly interested me. The first was “sharding”, whereby the Google dataset is distributed across multiple databases. Apparently this is one of the most important topics at Google. A quick search on this suggests that Google has released sharding technology to the open source community, in the form of the Hibernate Shards project.
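To make the idea of sharding a bit more concrete, here is a minimal sketch in Python. It is not how Google or Hibernate Shards does it; the names `shard_for` and `ShardedStore` are made up for illustration, and plain dictionaries stand in for the separate databases. The only assumption is that each record has a string key, which is hashed to decide which shard holds it.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (hypothetical helper)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap it
    # into the range of available shards.
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store split across several in-memory 'databases'."""

    def __init__(self, num_shards: int):
        # Each dict stands in for a separate database server.
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str, default=None):
        return self.shards[shard_for(key, len(self.shards))].get(key, default)


store = ShardedStore(num_shards=4)
store.put("manchester", "Google Tech Talk")
print(store.get("manchester"))
```

The same key always hashes to the same shard, so a lookup only needs to touch one database. A real system would also have to cope with adding shards and with queries that span several of them, which this sketch ignores.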
The other interesting thing was “stemming”, something you’re probably already familiar with. You type in your search query and get three results, then the second half of the page contains the results Google thinks you were looking for. This was described as “Give me the results I want, not what I asked for”, and Fabrice said that this is particularly difficult to implement. It would be nice to see something like this implemented for scientific literature searching.
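As a rough illustration, stemming in the information-retrieval sense means reducing words to a common root, so that a query also matches related word forms. The sketch below is a deliberately naive suffix stripper, not the algorithm described in the talk and not a proper stemmer such as Porter's; the suffix list and the function names `stem` and `search` are assumptions made for this example.

```python
# Naive suffix list, checked in order (an assumption for this sketch).
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search(query: str, documents: list) -> list:
    """Return the documents that share at least one stemmed term with the query."""
    query_stems = {stem(w) for w in query.split()}
    return [
        doc
        for doc in documents
        if query_stems & {stem(w) for w in doc.split()}
    ]


docs = ["He searched the archive", "Nothing relevant here"]
print(search("searching papers", docs))
```

Here the query word "searching" and the document word "searched" both reduce to "search", so the first document is returned even though the exact query word never appears in it. Crude rules like these go wrong quickly on real text, which hints at why doing this well is hard.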