Search Engine Features and Search Techniques

by Maria Antonietta Ricagno

When using a search engine, the hardest problems to solve are the huge number of results you get and deciding how much importance to give each of them.

In fact, the efficiency of a search engine depends mainly on its ability to list the results of a search with the most relevant ones ranked highest.

Its efficiency also shows in its capacity to interpret our query, which is a very difficult part of the job, since it is an automatic mechanism and not a human being endowed with intelligence.

When we submit a keyword to a search engine, the poor server that has to do the job looks through its database for every possible reference and picks out all the occurrences that may be of interest to us. It then sorts them according to a criterion that depends on the algorithm of the engine in question.

The main engines differ from one another in their ranking criteria. Knowing these criteria helps you take better advantage of the engines, possibly using them in different ways to get different results.

Likewise, it helps to use advanced search methods more often, instead of simply typing in a series of words.

This article takes a closer look at how search engines are used and also describes the operation of some meta-engines that are considered accurate and useful.

The basic logic common to all search engines is still the frequency with which the search terms occur in the metatags and within the page itself.

'Metatags' are a series of web page descriptors such as the title, the description (which the browser does not display), the keywords and several other fields (e.g. author and language).
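To make the idea of metatags more concrete, here is a minimal Python sketch that reads the title and the metatags of a page using only the standard library. The sample page and the names `MetaTagParser` and `extract_metatags` are invented for this illustration; it is not how any particular engine reads pages.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collects the <title> text and the name/content pairs of <meta> tags."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and name:
            # Store descriptors such as description, keywords, author, language
            self.meta[name.lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


# Hypothetical page, invented for this example.
SAMPLE_PAGE = """
<html><head>
<title>Localizzazione software</title>
<meta name="description" content="Servizi di localizzazione software">
<meta name="keywords" content="localizzazione, software, traduzione">
<meta name="language" content="it">
</head><body><p>Localizzazione di software e siti web.</p></body></html>
"""


def extract_metatags(html):
    """Return the page title and a dictionary of its metatags."""
    parser = MetaTagParser()
    parser.feed(html)
    return parser.title.strip(), parser.meta


title, meta = extract_metatags(SAMPLE_PAGE)
print(title)
print(meta)
```

The description and keywords never appear in the browser window, yet they are there for an engine to read and count.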

GOOGLE.

SIGNIFICANCE CRITERIA.

Doubtless, this is the engine that the reader of this article also uses most often. In fact, it is the most frequently submitted URL and it covers 90% of the queries made to all engines.

The importance of results is based upon an algorithm consisting of about a hundred parameters.

However, the guiding principles are well defined: the pages with the highest LINK POPULARITY are considered the most important, together with the pages in which the search words occur with sufficient frequency and close to one another.

The first concept means that the more external links point to a page, the more significant that page is considered to be.

The second concept establishes that if the search words are repeated many times within a page, then the topic of that page is probably the one you are looking for.

The third concept states that search words that appear close to each other are more significant than words that, although they occur within the page, are not so close to each other.
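The three concepts can be illustrated with a toy scoring function in Python. This is only a sketch under stated assumptions: the weights, the function names and the sample pages are all invented, and the real algorithm (with its hundred or so parameters) is not public.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def min_distance(words, term_a, term_b):
    """Smallest gap, in words, between an occurrence of term_a and one of term_b."""
    pos_a = [i for i, w in enumerate(words) if w == term_a]
    pos_b = [i for i, w in enumerate(words) if w == term_b]
    if not pos_a or not pos_b:
        return None
    return min(abs(a - b) for a in pos_a for b in pos_b)


def toy_score(text, inbound_links, term_a, term_b):
    """Combine link popularity, term frequency and term proximity.

    The weights 0.1 and 10 are arbitrary assumptions made for this example.
    """
    words = tokenize(text)
    if not words:
        return 0.0
    # Second concept: how often the search words occur in the page
    frequency = (words.count(term_a) + words.count(term_b)) / len(words)
    # Third concept: the closer the two words, the higher the bonus
    gap = min_distance(words, term_a, term_b)
    proximity = 0.0 if gap is None else 1.0 / max(gap, 1)
    # First concept: external links pointing to the page
    return inbound_links * 0.1 + frequency * 10 + proximity


# Hypothetical pages with the same number of inbound links.
page_a = "localizzazione software per aziende"
page_b = "software gratuito e guida alla localizzazione"

print(toy_score(page_a, 5, "localizzazione", "software"))
print(toy_score(page_b, 5, "localizzazione", "software"))
```

With equal link popularity, the first page scores higher because the two words are more frequent relative to its length and stand next to each other.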

Advanced search criteria.

Strings and words:

These criteria enable the user to reduce the number of results considerably and to choose among them more accurately when looking for two or more words occurring together,

for example:

software and localizzazione

software+localizzazione

Greater accuracy and selectivity can be obtained by putting the phrase you are looking for in quotation marks:

"localizzazione software"

In Boolean logic, these are AND-type searches, because you require all of the words to occur.

There are two further Boolean criteria that can be useful:

- looking for a word or for an alternative one (OR), so that you will find all the pages containing the term 'software' or the term 'localizzazione'. In that case, you will get the union of the two sets of results and therefore a higher number of results.

localizzazione OR software

- looking for pages that do not contain a certain term (NOT); of course, a search based on this criterion alone would return a list so large as to be totally useless.

Instead, this criterion is much more effective when used in conjunction with one of the two criteria described above.

For example, you may want to search for pages containing the phrase "localizzazione software", but not the word "Microsoft".

Therefore, your search would be: http://www.google.com/search?as_q="localizzazione software" -Microsoft
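The Boolean criteria above can be combined mechanically. The following Python sketch composes a query string from the four building blocks (AND, exact phrase, OR, exclusion); the function name `build_query` and its parameters are hypothetical and serve only as an illustration.

```python
def build_query(all_words=(), exact_phrase=None, any_words=(), exclude=()):
    """Compose a search string from the Boolean building blocks."""
    parts = list(all_words)                       # AND: every word must occur
    if exact_phrase:
        parts.append(f'"{exact_phrase}"')         # exact phrase in quotation marks
    if any_words:
        parts.append(" OR ".join(any_words))      # OR: alternatives
    parts.extend(f"-{word}" for word in exclude)  # NOT: minus sign attached to the word
    return " ".join(parts)


print(build_query(all_words=["software", "localizzazione"]))
print(build_query(any_words=["localizzazione", "software"]))
print(build_query(exact_phrase="localizzazione software", exclude=["Microsoft"]))
```

Note that the minus sign is written directly in front of the excluded word, with no space in between.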

Notes:

Common terms such as articles and prepositions are ignored during the search.

If, on the other hand, you want them to count as search criteria, you should add the '+' symbol. Ex: 'localizzazione+di+software'

Other criteria reducing the set of results are the following:

Looking for documents written in one language only. This is a risky criterion, as not every document declares its language in its metatags. On the other hand, any document that does declare its language in its metatags evidently attaches particular importance to that property, so the criterion also works as an indicator of quality.

The query would be:

search?as_q=localizzazione+software&lr=lang_it to look for pages in Italian only.
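A query of this kind can also be assembled in a program. The Python sketch below builds the address from the `as_q` and `lr` parameters mentioned in this article using the standard `urllib.parse` module; the function name `google_url` is hypothetical, and the language value is passed through as given (`lang_it` is used here on the assumption that the engine expects a two-letter language code).

```python
from urllib.parse import urlencode

# Base address as quoted earlier in this article
GOOGLE_SEARCH = "http://www.google.com/search"


def google_url(query, language_restrict=None):
    """Build a search address; spaces and quotation marks are encoded automatically."""
    params = {"as_q": query}
    if language_restrict:
        params["lr"] = language_restrict
    return GOOGLE_SEARCH + "?" + urlencode(params)


print(google_url("localizzazione software", language_restrict="lang_it"))
```

Letting `urlencode` do the escaping avoids hand-written addresses with stray spaces in them.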

- Looking for documents in a specific file format or leaving out such documents from the search.

This criterion adds nothing to the relevance of the search, but it is useful for leaving out all the documents in formats that you cannot or do not want to open.

Looking for documents within a date range.

This criterion lets you establish 'a priori', i.e. from the outset, whether your search should be limited in time, for example to recent documents only.

In GOOGLE, the criterion is restricted to a few preset ranges (all, last 3 months, last 6 months, last year).

If you want to leave out PDF files (the file-format criterion above), the query to be submitted will be:

localizzazione software -filetype:pdf

Looking for terms placed in specific domains (or leaving them out).

For example, you may want to search for pages (in the Italian version) dedicated to software localization within my website, so the phrase would be:

localizzazione software site:antotranslation.com

Using this criterion, the search is carried out within a single domain or a group of domains, for example all the domains in the Italian (.it) hierarchy.

Looking for terms placed in specific areas of the text (or leaving them out).

You may want to retrieve only those documents that contain the search term in their title, text, URL or internal links.

Searching by title can be a significant criterion: a term that appears in the title is no doubt more important than one that appears only in the text, because it is the title that provides the main definition of the document's contents.

Note that the engine treats as titles the HTML 'H' (heading) tags and the phrases set in a font larger than the standard size.

To look for the phrase in the title:

allintitle: localizzazione software

To look for the phrase in the body:

allintext: localizzazione software

Besides, the presence of the term in the page address is also a good indicator of the importance of a topic.

If a page is called 'localizzazionesoftware.htm', it is very likely to deal with software localization.

To look for the phrase in the web address:

allinurl: localizzazione software
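The restricting operators shown in this section all follow the same pattern: a keyword, a colon and a value, placed before or after the search words. The Python sketch below reproduces the examples given above; the helper names are hypothetical, and only operators quoted in this article are used.

```python
def exclude_filetype(terms, extension):
    """Leave out documents of a given file format."""
    return f"{terms} -filetype:{extension}"


def restrict_to_site(terms, domain):
    """Search within one domain only."""
    return f"{terms} site:{domain}"


def restrict_to_area(terms, area):
    """Search in one area of the document: title, text or url."""
    operators = {"title": "allintitle:", "text": "allintext:", "url": "allinurl:"}
    if area not in operators:
        raise ValueError(f"unknown area: {area}")
    return f"{operators[area]} {terms}"


terms = "localizzazione software"
print(exclude_filetype(terms, "pdf"))
print(restrict_to_site(terms, "antotranslation.com"))
for area in ("title", "text", "url"):
    print(restrict_to_area(terms, area))
```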

YAHOO

SIGNIFICANCE CRITERIA.

Yahoo ranks a page on the basis of its text, the accuracy of its title and description, its address (URL), its source, the links contained in the page and in other pages pointing to it, and other features of the website.

Advanced search criteria.

In Yahoo, the advanced search covers many of the criteria previously described for Google.

The syntax for exact phrase, OR, AND and exclusion (leaving out) is the same.

The presence of the word in the title

intitle:localizzazione+software

The presence of the word in the URL

inurl:localizzazione+soft

The presence of the exact phrase in the title

intitle:"localizzazione software"

localizzazione OR software

Search in domain: &vs=www.antotranslation.com

Search for file type: &vf=pdf

Search for language: &vl=lang_it

ICEROCKET

Advanced search criteria.

Exact phrase:

"localizzazione software"

OR

localizzazione OR software

exclusion

-localizzazione -software

domain

localizzazione software site:antotranslation.com

News is divided into 5 categories, and the news search works well.

MSN

Advanced search criteria.

Exact phrase:

"localizzazione software"

OR

localizzazione OR software

exclusion

-(localizzazione software)

domain

localizzazione software site:antotranslation.com

links to a domain

link:antotranslation.com

Country of origin

(loc:IT OR loc:AU)

language:

language:it

A peculiarity of MSN Search is the possibility of calibrating the visibility of results, either visually with three sliders in the advanced search or by setting values in the range 0..100 in the command string.

The criteria are the following:

exact match {mtch=50}

popularity index (link popularity) {popl=50}

page refresh index {frsh=50}
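As an illustration, the three settings can be generated and range-checked with a few lines of Python. The function name `msn_sliders` is hypothetical, and appending the settings after the search words, separated by spaces, is an assumption made for this sketch.

```python
def msn_sliders(match=50, popularity=50, freshness=50):
    """Return the three settings in the {name=value} form shown above."""
    values = {"mtch": match, "popl": popularity, "frsh": freshness}
    for name, value in values.items():
        if not 0 <= value <= 100:
            raise ValueError(f"{name} must be in the range 0..100, got {value}")
    return " ".join(f"{{{name}={value}}}" for name, value in values.items())


# Favour popular pages over exact matches and recently refreshed pages
print("localizzazione software " + msn_sliders(match=30, popularity=80, freshness=30))
```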

ALLTHEWEB

Advanced search criteria.

In ALLTHEWEB, the advanced search covers many of the criteria previously described for Google. The syntax for exact phrase, OR, AND and exclusion (leaving out) is the same.

The presence of the word in the title

title:localizzazione+software

The presence of the word in the URL

url:localizzazione+soft

Search in a website

site:www.antotranslation.com

Search in a domain

domain:.it

Search for file type

http://it.search.yahoo.com/search?va=localizzazione+software&vf=pdf

Search for language

http://it.search.yahoo.com/search?va=localizzazione+software&vl=lang_it
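Going the other way, an address like the ones above can be taken apart to see which criteria it contains. The Python sketch below does this with the standard `urllib.parse` module; the function name `describe_search_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def describe_search_url(url):
    """Split a search address into its host and its individual parameters."""
    parts = urlsplit(url)
    # parse_qs returns a list per parameter; keep the first value of each
    params = {name: values[0] for name, values in parse_qs(parts.query).items()}
    return parts.netloc, params


host, params = describe_search_url(
    "http://it.search.yahoo.com/search?va=localizzazione+software&vl=lang_it"
)
print(host)
print(params)
```

The '+' signs in the address are decoded back into spaces, so the search words can be read as they were typed.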

HOTBOT

Advanced search criteria.

Currently, HotBot has the most sophisticated advanced search system. It offers all the features already described for Google, as well as an unrestricted time filter (unlike Google and Yahoo), and its file-format filter is the best in terms of the number and quality of the formats offered.

The word filter is more detailed: you can combine both the position of the terms within the document and their individual inclusion or exclusion. For example, you can search for the word 'software' in the title and the word 'localizzazione' in the URL.

Finally, you can use these criteria in HOTBOT to query the GOOGLE database (the largest one) and the ASK JEEVES database directly.

ALTAVISTA

Advanced search criteria.

In ALTAVISTA, the advanced search covers many of the criteria previously described for Google and Yahoo.

As in HotBot and ASK JEEVES, the time filter is much more flexible: you can enter an actual date, or define a range in years, months and weeks.

Finally, advanced users can compose an SQL-style search string by combining the elements with Boolean logic.

TEOMA

SIGNIFICANCE CRITERIA.

In Teoma, significance is defined as 'authority' and is very similar to the 'link popularity' in Google; in addition, Teoma undertakes to exclude any links to spam websites. A distinctive feature of Teoma is the list of suggested terms shown alongside the search words.

Another feature connected to the search terms is the list of websites containing related link collections. This is a powerful feature that enables the user to broaden the search very accurately.

Advanced search criteria.

They are very similar to those used by HOTBOT; in addition, they implicitly handle both plurals and derivatives of the search words.

GIGABLAST

Advanced search criteria.

All the criteria related to terminology, file type, presence of the terms in URLs and page name. These are the syntaxes to be used:

  • suburl:
  • site:
  • url:
  • title:
  • ip: (if only the tcp/ip address is known and you want to display other information)
  • link: -link:(exclusion)
  • type:pdf type:doc type:xls type:ppt type:ps type:text
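Since the operators for the same criterion are spelled slightly differently from one engine to another, a small lookup table can help when the same search has to be repeated on several engines. The Python sketch below records only the title and URL operators quoted in this article; the function name `field_query` is hypothetical.

```python
# Title and URL operators as given in this article for each engine
OPERATORS = {
    "google": {"title": "allintitle:", "url": "allinurl:"},
    "yahoo": {"title": "intitle:", "url": "inurl:"},
    "alltheweb": {"title": "title:", "url": "url:"},
    "gigablast": {"title": "title:", "url": "url:"},
}


def field_query(engine, field, term):
    """Prefix a term with the operator that the given engine uses for a field."""
    try:
        prefix = OPERATORS[engine.lower()][field]
    except KeyError:
        raise ValueError(f"no '{field}' operator recorded for '{engine}'") from None
    return f"{prefix}{term}"


for engine in OPERATORS:
    print(engine, field_query(engine, "title", "localizzazione"))
```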

The results display also shows the percentage of occurrences of the search words among the results obtained. These occurrences in turn serve as suggestions for alternative terms.

ENTIREWEB

Advanced search criteria.

All the criteria related to terminology, language, geography, presence of the terms in URLs and page name.

LYCOS

One of the features of Lycos is that the resources associated with the search engine include one specialised in finding discussion venues (forums, mailing lists, etc.) related to the topic you are looking for. The keyword-based news search engine is also very good.

Advanced search criteria.

All the criteria related to terminology, language, date range, presence of the terms in URLs and page name.

META-ENGINES

MAMMA

Advanced search criteria.

All the criteria related to terminology, language, geography, presence of the terms in URLs and page name.

This meta-engine enables the selection of which directories you will search in:

  • Open Directory
  • Looksmart Directory
  • Business.com
  • About.com
  • Mamma's Collection

and which search engines:

  • Teoma
  • Google
  • MSN
  • Entireweb
  • Gigablast

IXQUICK

You can use natural language or complex Boolean searches supporting phrases, wildcards (meta-characters), excluded terms, mandatory terms, brackets and other modifiers such as NEAR (close to), as the meta-engine knows which search engines can carry out complex searches.

Duplicates are removed, but they are counted in order to weight the result: if the same page is returned by more than one engine, it is given more importance.
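The handling of duplicates can be sketched in a few lines of Python. This is an illustration of the principle only, not the meta-engine's actual method: the engine names and addresses are invented, and ranking by the simple number of engines is an assumption.

```python
from collections import Counter


def merge_results(results_by_engine):
    """Merge result lists, ranking pages by the number of engines that returned them."""
    counts = Counter()
    for urls in results_by_engine.values():
        counts.update(set(urls))  # count each page once per engine
    # Pages found by more engines come first; ties are sorted alphabetically
    return sorted(counts, key=lambda url: (-counts[url], url))


# Hypothetical result lists from three engines
results = {
    "engine_a": ["http://example.com/a", "http://example.com/b"],
    "engine_b": ["http://example.com/b", "http://example.com/c"],
    "engine_c": ["http://example.com/b", "http://example.com/a"],
}

for url in merge_results(results):
    print(url)
```

Each page appears only once in the merged list, and the page found by all three engines comes first.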

Meta-characters (wildcards) can stand in for any other character.

The NEAR command lets you require that one term appear close to another.

These are the syntaxes to be used:

  • +title:
  • +domain:
  • host:
  • immagine:
  • image:
  • url:
  • link:
  • text:
  • related:

You can select which engines are used according to the national version in use.

In fact, this meta-engine uses a pool of search engines that includes, in addition to the most important ones, engines operating at a national level.

You can phrase your query in conversational language, and it will be passed on to those search engines that accept that kind of search.

CLUSTY

In the result window, Clusty presents a list of terms related to the context of the query. This enables the user to look for the source topic in an alternative manner.

Advanced search criteria.

All the criteria related to terminology, language, presence of the terms in URLs and domain. Syntax used:

domain:

host:

Selection of the engines to be searched, from among:

  • GigaBlast
  • MSN
  • Lycos
  • Looksmart
  • Wisenut
  • Open Directory
  • Overture

WEBCRAWLER

Advanced search criteria.

All the criteria related to terminology, language, date range, presence of the terms in URLs and domain.

Maria Antonietta Ricagno© All Rights Reserved

© August 2, 2005, Maria Antonietta Ricagno for BabelPort
