Why the name Geirfan?
Geirfan is a portmanteau made up of the words gair (word) and man (site): a Word Site.
You'll see the same change to the vowels of gair when it appears as part of other words, like geiriadur (dictionary) and
geirfa (vocabulary).
Man is a common element in words for places and locations, like canolfan (centre) and gwefan (website).
Who's responsible for Geirfan?
Geirfan was originally inspired by a project to create a
learner-friendly list of frequently used Welsh words. The project was led by researchers from
Cardiff University, including principal investigator Dawn Knight and lead researcher Bethan Tovey-Walsh, along with an advisory committee from the
National Centre for Learning Welsh (Helen Prosser), the WJEC (Emyr Davies), and CorCenCC (Steve Morris).
The Geirfan website and the first batch of entries were created by Bethan Tovey-Walsh to show how the wordlist project's data
could be used as the basis of Welsh-language learning materials.
How do you choose which words to include in Geirfan?
The Geirfan wordlist comes from the 500-word list created by the
wordlist project mentioned above. It is therefore based on lists of the words most frequently encountered in modern
Welsh, alongside feedback from Welsh-language tutors about the types of vocabulary which
are most useful to their students.
There is currently a working list of around 600 words to be included. This list was compiled using data from CorCenCC,
the National Corpus of Contemporary Welsh. CorCenCC is a collection of spoken, written, and electronic Welsh, collected within the past ten years.
It is the largest corpus of modern Welsh in existence, and provides invaluable insights into how Welsh is used today.
From the raw lists of the most frequent words encountered in
CorCenCC, a team worked to identify a core list of
vocabulary which would be most useful for learners. This step included taking into
consideration the opinions of Welsh-language teachers about the usefulness of various
types of vocabulary, and developing principles for identifying high-frequency words which
were nonetheless not suitable for a learner dictionary. If you would like to see the
frequency lists, and find out more about the process of selecting learner-appropriate
vocabulary, the project's results are available to download here. (An academic journal article is
also in preparation; I will add a link here once it's available.)
The first sixty words added for the pilot project were chosen to illustrate the full range of content which will be added
to Geirfan over time. There are therefore some closely related families of words,
but also some which may seem randomly chosen. The latter are almost always examples of
particular word types which were needed to test site functionality and to illustrate Geirfan's capabilities.
Who is Geirfan meant for?
The primary audience is anyone who's learning Welsh. Because early entries will focus on the most common words in Welsh, the dictionary
is likely to be most useful for beginners at this stage. However, the entries are comprehensive and
provide a wealth of additional information, such as quotations, tips about usage,
information about the origins of the words, and lists of related words. These features
will be useful to learners at any level, and may also be of interest to fluent Welsh speakers.
Others who might find Geirfan useful
include teachers, and parents of children in Welsh-medium education.
Is Geirfan suitable for children?
Geirfan aims to provide comprehensive information about the words
listed in our dictionary, including any offensive meanings, or meanings related to
sensitive topics. Including these meanings is essential so that learners can avoid
accidentally using an offensive word or making an unintended double entendre. However, it
does mean that you may prefer to preview the content before showing it to younger
children.
Apart from the question of content, the language used for
definitions and explanations in Geirfan will also be difficult
for younger children. Adult learners benefit from access to detailed
information about the vocabulary they are learning. Geirfan's definitions therefore aim to be
thorough and exhaustive, rather than simple. As a result, the content is probably not
accessible to children until they are in their later teens.
In summary: once a child is at an age when they can benefit from
using adult dictionaries, and once you are comfortable allowing them to do so, Geirfan may be appropriate for them. As with any other online
content, however, please check out the site yourself if you have concerns about
its suitability for your child.
How do you choose your example quotations?
There are three types of quotation in Geirfan:
1. examples of a word in use, taken directly from CorCenCC
2. examples of a word in use, adapted from CorCenCC
3. examples invented to show a word in use
Type 1 quotations can be found in the CorCenCC corpus in the
exact form in which they appear in Geirfan (apart from changes
to capitalize the initial letters of sentences and add ending punctuation). Type 2 quotations are
taken from CorCenCC, but needed some changes to make them suitable for
Geirfan. This might mean removing parts of a sentence to make
it shorter, correcting typing errors, or replacing very unusual words with ones which a
learner will find easier to understand.
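For readers curious about what the small Type 1 changes amount to, here is a minimal sketch in Python. It is not the code used to build Geirfan: the function name `tidy_quotation` and the sample sentences are invented for illustration, and the sketch assumes that "ending punctuation" means a full stop unless the text already ends in a full stop, question mark, or exclamation mark.

```python
def tidy_quotation(text: str) -> str:
    """Capitalize the first letter and add ending punctuation if it is missing.

    Hypothetical helper, written only to illustrate the kind of change
    described above. It does not alter any other part of the quotation.
    """
    text = text.strip()
    if not text:
        return text
    # Upper-case only the first character; the rest is left untouched.
    text = text[0].upper() + text[1:]
    # Assumption: a full stop is the default ending punctuation.
    if text[-1] not in ".?!":
        text += "."
    return text


# Invented sample sentences (not taken from CorCenCC).
print(tidy_quotation("mae hi'n braf heddiw"))
print(tidy_quotation("ble mae'r gath?"))
```

Because nothing else in the sentence is changed, the wording of a Type 1 quotation stays exactly as it was found in the corpus.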
Type 3 quotations are used only when there is no suitable material
from CorCenCC. This is unusual, since the words are chosen because they are very
frequent in the CorCenCC data. One of the commonest reasons for
using a constructed example is to illustrate an unusual initial-consonant mutation.
Even when a Type 3 example must be used, it is based as far as possible on the CorCenCC data.
Information about words which commonly appear together can be used, for example, so that the sentence features
vocabulary that normally co-occurs with the focus word.
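To give a rough idea of what "words which commonly appear together" means in practice, the sketch below counts the words found next to a focus word in a handful of sentences. It is an illustration only, not the method used for Geirfan: the function `cooccurring_words`, the `window` parameter, and the three sample sentences are all invented here, and real corpus tools are considerably more sophisticated.

```python
from collections import Counter


def cooccurring_words(sentences, focus, window=1):
    """Count the words appearing within `window` positions of `focus`.

    Hypothetical helper: splits on whitespace and lower-cases everything,
    which is far cruder than real corpus software.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token == focus:
                start = max(0, i - window)
                # Words just before and just after the focus word.
                neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
                counts.update(neighbours)
    return counts


# Invented sample sentences (not taken from CorCenCC).
sample = [
    "dw i'n hoffi coffi du",
    "mae coffi du yn dda",
    "dw i eisiau coffi gyda llaeth",
]
counts = cooccurring_words(sample, "coffi")
print(counts.most_common(1))
```

In this toy data, "du" (black) is the word found most often beside "coffi" (coffee), so a constructed example for "coffi" might reasonably include it.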
How big will Geirfan be when it's finished?
Six hundred entries is the initial target. After that, we shall see!