Developing an information search strategy: Key words
Words in documents
- Full-text indexing (for instance, Google): every word in a document is a key word, an entry point to that document.
- Indexing restricted to metadata: the only available entry points are the descriptors added "manually" by reference librarians or authors (keywords, classification indexes and summaries), together with the titles of the documents and their authors' names. The key words are taken from various lists, thesauri, classifications and other catalogues, or from APE codes (French business activity codes).
Library catalogues and many databases are not indexed in full text. To run a query in this type of tool, you need to consult these lists, thesauri and reference classifications, or at least be aware of them.
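The difference between the two kinds of indexing can be sketched in a few lines of Python. The record, its field names and the `build_index` helper are invented for illustration; real catalogues and search engines are far more elaborate.

```python
import re
from collections import defaultdict

# Hypothetical record: title, author and keywords are metadata, body is the full text.
DOCUMENTS = {
    1: {
        "title": "Laboratory safety handbook",
        "author": "A. Martin",
        "keywords": ["safety", "chemistry"],
        "body": "Always wear goggles when handling solvents in the fume hood.",
    },
}

def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z]+", text.lower())

def build_index(documents, fields):
    """Map each word found in the given fields to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for field in fields:
            value = doc[field]
            text = " ".join(value) if isinstance(value, list) else value
            for word in tokenize(text):
                index[word].add(doc_id)
    return index

metadata_index = build_index(DOCUMENTS, ["title", "author", "keywords"])
full_text_index = build_index(DOCUMENTS, ["title", "author", "keywords", "body"])

print(sorted(full_text_index.get("goggles", set())))  # [1]
print(sorted(metadata_index.get("goggles", set())))   # []
```

The word "goggles" appears only in the body, so it is an entry point in the full-text index but not in the metadata index, where only the title, author and descriptor words can be searched.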
Reducing noise in a search on a full text
If you search a full text, there is little chance of missing documents because you forgot a word or its synonym. However, the noise generated by this type of indexing needs to be limited by using rare, discriminating terms, ideally terms specific to a topic, a community or an author.
For instance: a Google query with the French term "socio-technique" will lead you straight to the work of the Centre de sociologie de l'innovation (Centre for the Sociology of Innovation) at the Paris-based École des Mines. This is an excellent term for finding, in a single query, part of its work or the work of those who refer to it.
Your search topic
One possible method of analysis is the six-question 5W1H method:
- Who? Who are the players? Which companies, which laboratories, which institutions, which people are involved?
- What? Which studies, processes, activities, fields, experiments … are we talking about?
- When? Should we take a historical perspective?
- Where? Should we take a geographical perspective?
- How? How could we describe the object of research?
- Why? What are the aims of the research? What is the object of research intended for? What is it used for? What does it apply to?
This first brief exploration allows you to list an initial series of concepts defining the subject.
The next step is to translate these concepts into key words for a document search.
Optimising a search in library catalogues
When a search engine, like those used in libraries, only works on titles and on descriptors added by archivists or librarians, that is, on only a few words, there is a risk of silence (a "no result" message).
To get the most out of a search on these terms and not miss any relevant document, you should:
- increase the number of synonyms, equivalents, and specific or generic terms in your search
- refer to the reference indexes, lists and classifications.
Examples of queries run in a library catalogue for a search on:
- safety in a chemistry laboratory: (safety OR security OR risk) AND laboratory AND chemistry
- violence within couples: (violence OR abuse OR victim) AND (couple OR partner OR family)
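The pattern behind these examples (alternatives for one concept joined with OR, distinct concepts joined with AND) can be sketched as a small helper. `build_boolean_query` is a hypothetical function written for this guide, and the exact operator syntax accepted by a given catalogue may differ.

```python
def build_boolean_query(concept_groups):
    """Join the terms of each concept with OR, then join the concepts with AND."""
    clauses = []
    for terms in concept_groups:
        if len(terms) == 1:
            # A single term needs no parentheses.
            clauses.append(terms[0])
        else:
            clauses.append("(" + " OR ".join(terms) + ")")
    return " AND ".join(clauses)

query = build_boolean_query([
    ["violence", "abuse", "victim"],
    ["couple", "partner", "family"],
])
print(query)  # (violence OR abuse OR victim) AND (couple OR partner OR family)
```

Each inner list holds the synonyms and equivalents for one concept, so adding a synonym widens the search while adding a concept narrows it.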
Efficiency of the descriptors:
Descriptors do have an advantage, however: because those used by librarians belong to a documentary language recorded in professional tools such as thesauri, they are less polysemous than natural-language words, so a single term is likely to gather the documents that bear on the same topic.
On notions of noise and silence, please refer to the Methodoc guide (Rennes 2 University) on research.
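As a rough illustration, noise and silence can be expressed as rates, assuming the usual definitions: noise is the share of retrieved documents that are irrelevant, and silence is the share of relevant documents that were not retrieved. The function name and the document ids below are invented for the example.

```python
def noise_and_silence(retrieved, relevant):
    """Return (noise rate, silence rate) for a result set, given the relevant documents."""
    retrieved, relevant = set(retrieved), set(relevant)
    # Noise: irrelevant documents among those retrieved.
    noise = len(retrieved - relevant) / len(retrieved) if retrieved else 0.0
    # Silence: relevant documents that the search failed to retrieve.
    silence = len(relevant - retrieved) / len(relevant) if relevant else 0.0
    return noise, silence

noise, silence = noise_and_silence(retrieved=[1, 2, 3, 4, 5], relevant=[4, 5, 6, 7])
print(noise, silence)  # 0.6 0.5
```

In practice the full set of relevant documents is rarely known, so these rates are a way of reasoning about a search rather than something a catalogue reports.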
The main value of thesauri is that they are commonly used by archivists and librarians and that they provide a context for the concepts searched: generic, specific or related terms.
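A minimal sketch of how this context can feed a search, assuming a thesaurus entry stored as a dictionary of generic, specific and related terms. The entry and the `expand_term` helper are illustrative and are not taken from a real thesaurus.

```python
# Hypothetical thesaurus entry; the terms are invented for illustration.
THESAURUS = {
    "violence": {
        "generic": ["social problems"],
        "specific": ["domestic violence", "verbal abuse"],
        "related": ["victims", "aggression"],
    },
}

def expand_term(term, thesaurus, relations=("specific", "related")):
    """Return the term followed by the terms linked to it through the chosen relations."""
    entry = thesaurus.get(term, {})
    expanded = [term]
    for relation in relations:
        for linked in entry.get(relation, []):
            if linked not in expanded:
                expanded.append(linked)
    return expanded

print(expand_term("violence", THESAURUS))
# ['violence', 'domestic violence', 'verbal abuse', 'victims', 'aggression']
```

Adding specific and related terms widens a search that returns too little, while moving to the generic term is a way to step back when the starting term is too narrow.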