GELT ® and GESWE ® are a series of English proficiency tests for adult non-native speakers of English.
Because they measure different skill areas, they can be used
together or independently.
Administered online, they are institutional tests that help organizations recruit and
hire new employees, identify staff in need of English training programs, and ensure that employees’
English-language skills meet international standards, providing a solid foundation for their local or
international operations.
By 2019, the number of non-native English speakers had reached over 1B people worldwide, compared to 750M native
English speakers. English as a second language grows in importance each year, and measuring the English levels
of candidates and students is becoming a must for all sorts of institutions.
1. GLOBAL ENGLISH LEVEL TEST
GELT ®
GELT is the Global English Level Test, which assesses the English proficiency of adult
non-native speakers. It is based on a multi-variation testing architecture: every instance of the test comes
with a different set of questions, so each candidate sees different questions. Behind the scenes, GELT
is fed by a question bank of 1000+ questions created by English language experts and mapped to the items to be
measured in accordance with measurement and validation principles.
Total number of questions
80 multiple-choice questions.
Duration
45 minutes.
Sections
GELT ® is designed to measure English proficiency in four basic language skill areas:
Grammar
36 multiple-choice questions - Use of English tests cover grammar and vocabulary. Knowing
grammar and vocabulary is at the heart of learning a language. Therefore, assessment of grammar and
vocabulary is one of the most common forms of testing language knowledge. Grammar and vocabulary
questions are carefully prepared in order to assess the level of the test taker in relation to
the Common European Framework of Reference (CEFR).
Vocabulary
24 multiple-choice questions - Participants answer questions on idiomatic and miscellaneous
expressions, vocabulary from different fields such as sports, education, health and politics,
and prefixes and suffixes.
Reading
10 multiple-choice questions - The reading sections mainly focus on identifying the main
idea of a passage, making inferences and identifying the author’s purpose. In addition, the text
passages build on vocabulary knowledge by requiring the reader to combine the meanings of
individual words to comprehend the overall text. Vocabulary plays a well-established role in the
texts: readers encounter complex concepts, phrases, idioms and expressions from spoken
language (both British and American). The articles cover current, engaging topics that lend
themselves to debate.
Listening
10 multiple-choice questions - The listening sections mainly focus on identifying the
main idea of the speech, making inferences, identifying the speaker’s purpose and understanding
the meaning of some idiomatic expressions. Candidates listen to the recordings and then answer
multiple-choice questions that address various levels of literal and inferential comprehension.
The topics are current and engaging. Test takers should also be ready to hear different
accents. (Users should use a headset to listen to the recordings, and each recording can be
played only a limited number of times.)
Repeatability: The same person can take the test up to 11 times without being asked the
same question twice: each attempt uses a completely different set of questions.
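The repeatability rule can be illustrated with a short sketch. It assumes a question bank split by section, with section sizes taken from the figures above (36 + 24 + 10 + 10 = 80); the bank layout, the function name draw_form and the question ids are hypothetical and do not describe the actual GELT implementation. Note that 11 attempts of 80 questions need 880 distinct questions, which fits within a bank of 1000+ questions only if each section holds enough items.

```python
import random

# Section sizes as stated for GELT: 36 + 24 + 10 + 10 = 80 questions.
TEST_PLAN = {"grammar": 36, "vocabulary": 24, "reading": 10, "listening": 10}


def draw_form(bank, used_ids, rng):
    """Draw one test form, skipping every question this candidate has seen.

    bank: dict mapping section name -> list of question ids (hypothetical layout).
    used_ids: set of ids already shown to the candidate; updated in place.
    """
    form = {}
    for section, count in TEST_PLAN.items():
        unseen = [q for q in bank[section] if q not in used_ids]
        if len(unseen) < count:
            raise ValueError(f"not enough unseen {section} questions")
        form[section] = rng.sample(unseen, count)
    # Record the questions only after every section was filled successfully.
    for questions in form.values():
        used_ids.update(questions)
    return form


# Hypothetical bank sized so that each section can supply exactly 11 disjoint forms.
bank = {
    section: [f"{section}-{i}" for i in range(count * 11)]
    for section, count in TEST_PLAN.items()
}
rng = random.Random(2024)
seen = set()
forms = [draw_form(bank, seen, rng) for _ in range(11)]
```

With this bank size, a twelfth call to draw_form raises ValueError, which mirrors the stated limit of 11 attempts without repeated questions.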
Grading: Since GELT is a multiple-choice test, grading is automatic and the
test score is expressed as a percentage between 0 and 100. The test
taker’s (candidate, student, worker, etc.) result is then interpreted by matching the “Test Result” with the “Level”
metrics given in the Common European Framework of Reference scheme below. This process is
straightforward and can easily be monitored and reported by test administrators without any need for additional
professional evaluation.
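As a rough sketch of this two-step grading, the snippet below converts a raw count of correct answers into a percentage and looks up a CEFR level. The cut-off values are placeholders chosen only to make the example runnable; they are not the published GELT mapping and should be replaced with the official score table. The function names are likewise assumptions.

```python
from bisect import bisect_right

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# PLACEHOLDER cut-offs: lowest percentage for A2, B1, B2, C1 and C2.
# These are NOT the official GELT values; substitute the published mapping.
PLACEHOLDER_CUTOFFS = [20, 40, 60, 75, 90]


def percentage_score(correct, total=80):
    """Convert the number of correct answers into a 0-100 percentage."""
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return 100.0 * correct / total


def cefr_level(score, cutoffs=PLACEHOLDER_CUTOFFS):
    """Return the CEFR level whose score band contains the given percentage."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    # bisect_right counts how many cut-offs the score has reached.
    return CEFR_LEVELS[bisect_right(cutoffs, score)]
```

For example, 60 correct answers out of 80 give a score of 75.0, which the placeholder cut-offs place at C1; with the official table the level may differ.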
Norms
Each year, more than 10,000 people take the test. The results are analyzed quarterly.
2. GLOBAL ENGLISH SPEAKING AND WRITING EXAM
GESWE ®
GESWE is the Global English Speaking and Writing Exam, which assesses the proficiency of adult non-native speakers
specifically in two main skill areas.
Total number of questions
4 (2 writing and 2 speaking)
Duration
42 minutes
Sections
GESWE is designed to measure English proficiency in two basic language skill areas:
1. Speaking Section
Candidates are given 10 minutes to respond to each question. They should
speak for at least 4 minutes about a given topic and have a maximum of 2 attempts. The aim of this
section is to evaluate test takers on a series of phonological factors. Test takers should
pay attention to aspects such as fluency, pronunciation, vocabulary, accuracy,
communication, delivery, topic development and language use. The speaking tests involve only
monologue-type speech, not interaction.
2. Writing Section
Candidates are given 10 minutes to write each essay of at least 3
paragraphs and 200 words. The purpose of this section is to assess the writing ability of
the participants. Each paragraph should be well structured, with a main topic,
supporting details and a concluding sentence. Essays are graded according
to how the topic is developed, language use, vocabulary, spelling, punctuation, and
mechanics.
Repeatability: The same person can take the test up to 10 times without being asked
the same question twice: each attempt uses a completely different set of questions.
Additional requirements: Candidates should respond to each task as completely as they can and
should always speak clearly into the microphone.
Grading: GESWE is an input (open-ended) test in which test takers answer questions
both by writing essays and by speaking about given topics, so evaluating and grading the answers
is a more involved process.
First, the test takers’ answers must be read and listened to by evaluators (in
our case, qualified English teachers).
Evaluation: Evaluation of a GESWE test is finalized within 3 working days of the
completion of the test. The test score is again expressed as a percentage between 0 and 100, and
the result is interpreted by matching the “Test Result” with the “Level” metrics given in the Common
European Framework of Reference scheme below.
Evaluation and grading principles are described in detail in the table below:
Little or no attempt to complete the task, and irrelevancies suggest failure to understand what
is required. No evidence of prior planning. Probably incoherent or very short. Frequent or serious
grammatical errors, which often obscure meaning, and a restricted and repetitious vocabulary
inappropriate to the task(s).

Some attempt to complete the task, with some omissions and/or irrelevant information. Some
attempt at prior planning, but evidence of poor organization of ideas. Generally coherent piece
but meaning not always clear. Elementary grammatical errors which occasionally obscure meaning.
Tendency to use restricted range of vocabulary with some inappropriate usages.

Understands and completes the task with occasional irrelevant information. Some attempt at prior
planning and organization, but pedestrian and showing little sign of genuine thought. Generally
coherent, fairly clear piece. Some grammatical errors which rarely or never obscure meaning.
Limited range and use of vocabulary but mostly used appropriately.

Understands and engages with the task, but less personally or impressively. Evidence of planning
with sound organization and structure. Coherent, clear piece with effective examples and supports
for ideas. Some minor grammatical errors, which do not obscure meaning. Generally good range and
use of vocabulary with occasional but rare inappropriate uses.

Understands and engages with the task as required, provides ample and relevant supporting ideas
and examples. Evidence of thorough prior planning and very good organization. Well-developed,
coherent, clear piece. Very few or no grammatical errors. Excellent range and appropriate use of
vocabulary. Creates interest. Some evidence of a personal ‘voice’.
Common European Framework of Reference for Languages
Results in both GELT and GESWE are given between 0% and 100% and are evaluated according to the table below,
which reflects the Common European Framework of Reference for Languages: Learning,
Teaching, Assessment (abbreviated in English as CEFR or CEFRL), a definition of different language levels
written by the Council of Europe. It is a guideline used to describe the achievements of learners of foreign
languages across Europe and, increasingly, in other countries:
Mapping of Test Scores and CEFR Levels
Distribution of questions by CEFR level
As seen in the table above, the English levels comprise 6 levels of reference: three blocks (A or Basic User, B or
Independent User, and C or Proficient User), each with two sublevels (1 and 2).
English Level A1
Can use simple phrases for basic needs, and can have basic interactions provided the other
person speaks clearly.
English Level A2
Can use English for everyday tasks and activities, understand common phrases related to
topics such as one's personal information or employment.
English Level B1
Can have simple conversations about familiar topics, describe some experiences slowly, and
deal with most situations while traveling.
English Level B2
Can communicate confidently about many topics, speak with native speakers without difficulty and with
spontaneity, and understand the main ideas of texts about familiar topics.
English Level C1
Can express themselves fluently in almost any situation, without the need to search for
expressions, and produce clear, detailed texts on challenging subjects.
English Level C2
Can use the English language with complete mastery. Can differentiate finer shades
of meaning even in more complex situations.
AREAS OF USE
1. Hiring
Corporate companies use GELT and GESWE mainly for:
Pre-employment processes for campus hiring, mid-level and mass recruitment
GELT is mainly used by corporate companies in pre-employment processes for management trainees and mid-level
positions, since a good level of English is highly recommended for both.
Another common use case of GELT is campus hiring. The main advantage of using GELT in mass
recruitment is the multi-variation testing architecture of the test: every
student is examined on equal terms but receives a different set of questions, which reduces the scope for cheating
during the examination. The same person can take the test up to 11 times without being asked the
same question twice, since each attempt uses a completely different set of questions.
Additionally, GESWE is used both before international placements or promotions within the company and
in pre-employment processes when the vacant position requires a high level of writing and speaking skills.
The testing mechanism for GESWE works on the same principle as GELT, so every candidate is
examined with a different set of questions.
2. Education
Educational institutions, universities and language courses use GELT and GESWE mainly for:
English level tests for remote students (placement and selection)
English level assessment for preparatory classes and acceptance procedures
Use of English Level Test by educational institutions
GELT and GESWE are mainly used by educational institutions and language schools for remote English
proficiency assessments. These institutions usually move from offline forms to online tests, which simplifies and
automates their examination and evaluation processes.
Schools and universities rely heavily on GELT and GESWE to evaluate the English levels of students
who have applied to their institutions. They base their acceptance and
evaluation decisions on the results of these exams.
Some language schools use GELT to determine and evaluate the progress of students who have
completed English classes.
In conclusion, GELT and GESWE are administered by companies and educational institutions more than 100K
times per year. The results are satisfactory: final decisions based on these exams
reflect the English levels of candidates correctly in 96.43% of cases.
VALIDITY & RELIABILITY
Reliability and validity are the two most important properties of a test. If the results of a given test are
consistent on all occasions, the test can be said to be reliable. The level of score consistency is
therefore a measure of the reliability of the test. Validity refers to the extent to which a test measures what it
is intended to measure, which can be checked against other commonly used tests.
1. Validity
The validity of a test is a much-discussed topic, but the common consensus is that different
types of evidence combine to form a full picture of validity, covering the test taker, the test
system, the questions and activities included in the test, as well as aspects such as timing and the scoring
system.
Questions to ask when checking the validity of a test include:
“Are our exams an authentic test of real-life English?”
“Does the test do what it claims to do?”
“Is the test fit for purpose?”
“Does the test work appropriately within the same educational and social contexts and with specific test
takers?”
In conclusion, validity is not a number; it is an argument built around a set of evidence that is
expected to support any decisions made on the basis of test performance.
How do we test and improve the validity of GELT and GESWE?
We run case studies in which the same candidates take GELT, GESWE and other comparable
tests used globally in English proficiency assessment.
We work with universities to test and improve the quality and the validity of our tests.
After the evaluation period, we address weak points, if any, once a year.
2. Reliability
An assessment is reliable if it measures the same thing consistently and reproducibly. An assessment with
high reliability should be very likely to reach the same conclusions about the candidate’s knowledge or
skills on two separate occasions, whereas a test with poor reliability might produce very different results across
the two occasions. Some believe that longer multiple-choice tests tend to be more reliable because more items
automatically reduce the error of measurement. Indeed, a sufficient number of items must be included to
cover the content areas tested; however, there are other factors that contribute to how efficiently a test
measures and separates candidate ability. The goal of any certification examination is to distinguish
between those candidates who are worthy of passing and those who are not. The better the test items
distinguish among candidate abilities, the less measurement error there is in the examination and the higher
the candidate separation reliability, regardless of the number of items. For example, deleting poorly
performing items makes the test shorter, but also increases the accuracy of measurement because the items
producing the most error are eliminated from the examination.
How do we improve the reliability of GELT and GESWE?
We use enough questions to assess competence.
We create a consistent environment for participants.
We provide an easy and familiar assessment user interface to the participants.
We measure reliability of both tests twice a year.
We conduct item analysis every 3 months to eliminate poorly performing questions.
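The text does not say which statistics are used for these checks. As one possible illustration, the sketch below computes two measures commonly used for multiple-choice tests: the KR-20 reliability coefficient and an upper-lower discrimination index for a single item. The choice of these measures, the function names and the sample data are all assumptions.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 reliability for dichotomous (0/1) item scores.

    responses: one list of 0/1 item scores per candidate.
    """
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_variance = pvariance(totals)
    if total_variance == 0:
        raise ValueError("total scores have no variance")
    pq_sum = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in responses) / len(responses)
        pq_sum += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq_sum / total_variance)


def discrimination_index(responses, item, fraction=0.27):
    """Share correct in the top-scoring group minus share correct in the bottom group."""
    ranked = sorted(responses, key=sum)
    k = max(1, round(len(ranked) * fraction))
    lower, upper = ranked[:k], ranked[-k:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / k


# Hypothetical data: 4 candidates, 5 items; the last item behaves poorly.
sample = [
    [1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
weak_item = discrimination_index(sample, item=4, fraction=0.5)
```

In this sample the last item is answered correctly only by the two lowest-scoring candidates, so its index is negative. Items with low or negative values are the kind of poorly performing questions an item analysis would flag for review.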
SECURE ONLINE LANGUAGE ASSESSMENT SOFTWARE
There are many English level tests on the market, online and offline. Every year more than 250M people
take English level tests worldwide for educational, business or migration purposes. Even though the content is
of a high standard, many online language level tests are not delivered through secure exam software, which
leads to unreliable results.
Our research shows that if the exam software does not provide a dependable security framework with monitoring,
detailed logging, proctoring, full screen enforcement, systematic content randomization, shuffling and the ability
to run multiple timers on both the client and the server side, users tend to cheat 12 times more. Introducing
security measures decreases cheating attempts by more than 90%.
The Global English Level Assessment Series is served only through the Testinvite Exam Software, which
provides the required security framework to conduct secure language proficiency exams online.
Webcam proctoring, multiple timers, questions with rich media, navigation restrictions, multiple sections,
assessment with multiple stages, full screen enforcement, question bank integration, systematic random question
selection and shuffling are among the exam software features that make a secure & reliable online English Level
Assessment Test possible.
Test Type
The tests consist of both multiple-choice and input (open-ended) questions. They are created in accordance
with measurement and evaluation criteria, and content validity was taken into consideration based on the
TEST PLAN.
Time Limitations
A time limit applies to each page and to the test as a whole (in online tests,
applying both time constraints together is part of test security).
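A minimal sketch of how two such limits can interact, assuming times are tracked in seconds: the time left on a page is the smaller of the remaining page time and the remaining total time. The 45-minute total is GELT's stated duration; the per-page limit and the function names are hypothetical.

```python
GELT_TOTAL_SECONDS = 45 * 60  # 45 minutes, the stated GELT duration
PAGE_SECONDS = 120            # hypothetical per-page limit


def seconds_left(now, test_started, page_opened,
                 total_limit=GELT_TOTAL_SECONDS, page_limit=PAGE_SECONDS):
    """Time left on the current page: the stricter of the two limits wins."""
    total_left = total_limit - (now - test_started)
    page_left = page_limit - (now - page_opened)
    return max(0, min(total_left, page_left))


def accept_answer(now, test_started, page_opened):
    """An answer counts only while both timers are still running."""
    return seconds_left(now, test_started, page_opened) > 0
```

Evaluating a check like this on the server as well as in the browser means a candidate cannot gain time by tampering with the client-side timer.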
Test Management
The test is managed and conducted online (web-based).
Security
Full Screen: You can optionally force the user's browser to stay in full screen mode during the exam.
Users will not be allowed to open other tabs or apps during the exam.
Proctoring: Both tests can optionally be monitored by webcam (video recording or
picture taking during exams).
Multi-variation testing (repeatability)
In online tests, repeatability is the basic condition for test security. This is an applied method
(test-retest method) for the highest test security, validity and reliability. In a multi-variation
test, each candidate must receive the same number of questions at equal difficulty levels, but with a
different set of questions.
Example 1: A candidate can take a multi-variation test 10 times with a different set of
questions each time, in accordance with the TEST PLAN.
Example 2: When candidates A and B complete a multi-variation test, the questions they
have in common make up at most 10% of the whole test.
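The 10% limit in Example 2 can be checked with a few lines of code. This is only a sketch: the function names are hypothetical, and each test form is assumed to be a list of question ids.

```python
def overlap_ratio(form_a, form_b):
    """Share of the test that two candidates' forms have in common."""
    if len(form_a) != len(form_b):
        raise ValueError("forms must contain the same number of questions")
    shared = set(form_a) & set(form_b)
    return len(shared) / len(form_a)


def within_overlap_limit(form_a, form_b, limit=0.10):
    """True if the shared questions make up at most `limit` of the test."""
    return overlap_ratio(form_a, form_b) <= limit


# Hypothetical 80-question forms identified by integer ids.
candidate_a = list(range(0, 80))
candidate_b = list(range(72, 152))  # shares ids 72..79 with candidate A
```

Here the two forms share 8 of 80 questions, which is exactly 10% and therefore still within the limit; a ninth shared question would exceed it.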
Both GELT and GESWE are designed with the multi-variation testing method to overcome operational and
security-based problems during recruitment processes.
During an online assessment taken by many people, some of them may
face problems (technical, connection, etc.) and their assessments may be interrupted unintentionally.
Such candidates then need to repeat the assessment under the same standards.
The multi-variation testing feature solves this problem and lets interrupted candidates
retake the exam with a different set of questions within the same TEST PLAN and standards.
Shuffling options
The platform offers different shuffling options, which are critical for test security: shuffle
the order of choices, questions (within a single page) and sections (within a test).
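The three shuffling levels can be sketched as follows. The data layout (a test as a list of sections, a section as a list of pages, a page as a list of question records) and all names are assumptions made for illustration and do not reflect the platform's internal model.

```python
import random


def shuffle_choices(question, rng):
    """Return a copy of the question with its answer choices reordered.

    The correct answer is stored by value, so it stays valid after shuffling.
    """
    choices = list(question["choices"])
    rng.shuffle(choices)
    return {**question, "choices": choices}


def shuffle_test(sections, rng):
    """Shuffle choices, then questions within each page, then sections.

    Page order inside a section is left untouched.
    """
    shuffled_sections = []
    for section in sections:
        new_section = []
        for page in section:
            new_page = [shuffle_choices(q, rng) for q in page]
            rng.shuffle(new_page)
            new_section.append(new_page)
        shuffled_sections.append(new_section)
    rng.shuffle(shuffled_sections)
    return shuffled_sections


def make_question(qid):
    # Hypothetical question record used only for this demonstration.
    return {"id": qid, "choices": ["a", "b", "c", "d"], "answer": "b"}


original = [
    [[make_question("g1"), make_question("g2")], [make_question("g3")]],
    [[make_question("v1"), make_question("v2")]],
]
variant = shuffle_test(original, random.Random(7))
```

Storing the correct answer by value rather than by position is a deliberate choice in this sketch: reordering the choices can then never invalidate the answer key.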
Navigation options
There are three navigation options to be set for sections (within a test) and pages (within a section):
“Allow going forward & backward”, “allow only going backward” and “don’t allow going forward &
backward”. For both tests, this option is set to “don’t allow going forward & backward” for security
reasons: test takers can only follow the section and page order defined by the tester, which prevents them
from previewing upcoming questions.
Easy, fast, personalized e-mail Invitations with tracking
When you need to invite candidates or students to online assessments in large numbers, the invitation process
can take time. The Testinvite e-mail invitation module makes this process easier and faster by using a smart
tagging system. Select the people to invite, create one e-mail for your whole selection, add smart tags (name,
invitation code, credentials, task link, extra tags), preview your e-mail, set your sender details and
schedule a delivery date for future mailing. The platform sends all e-mail invitations at once, so
there is no need to restart the process for each candidate.
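The idea behind smart tags can be illustrated with Python's standard string.Template class. This sketch is not Testinvite's tag syntax or API: the tag names, the $-style placeholders and the sample recipients are assumptions.

```python
from string import Template

# One message body for the whole selection; $-placeholders stand in for smart tags.
INVITATION = Template(
    "Dear $name,\n"
    "You are invited to take the assessment.\n"
    "Invitation code: $invitation_code\n"
    "Start here: $task_link\n"
)


def render_invitations(template, recipients):
    """Fill the template once per recipient; a missing tag raises KeyError."""
    return [template.substitute(recipient) for recipient in recipients]


# Hypothetical recipients for demonstration only.
recipients = [
    {"name": "Ada", "invitation_code": "A-001", "task_link": "https://example.com/t/1"},
    {"name": "Bora", "invitation_code": "B-002", "task_link": "https://example.com/t/2"},
]
messages = render_invitations(INVITATION, recipients)
```

Using substitute rather than safe_substitute makes a missing tag fail loudly, so an invitation with an unfilled placeholder is not sent by mistake.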
Date restrictions
There are three date restriction settings within the platform:
Set a general date range (availability) for the whole assessment.
Determine different date ranges for each step of the assessment.
Create different availability dates for each candidate.
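How the three settings combine is not described here. The sketch below assumes the simplest rule, namely that every configured date range must contain the moment at which the candidate tries to start; the function names and dates are hypothetical.

```python
from datetime import datetime


def within(moment, date_range):
    """True if moment falls inside date_range; None means no restriction."""
    if date_range is None:
        return True
    start, end = date_range
    return start <= moment <= end


def may_start(moment, assessment_range, step_range=None, candidate_range=None):
    """Assumed rule: every configured range must contain the moment."""
    return all(
        within(moment, date_range)
        for date_range in (assessment_range, step_range, candidate_range)
    )


# Hypothetical configuration for demonstration only.
general = (datetime(2024, 3, 1), datetime(2024, 3, 31))
step_two = (datetime(2024, 3, 10), datetime(2024, 3, 20))
late_candidate = (datetime(2024, 3, 15), datetime(2024, 3, 31))
```

Under this rule a candidate with the late_candidate range cannot start step two on 12 March, even though both the general range and the step range are open on that day.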
Reporting
Measurements obtained from both tests are reported as total score, dimension scores, in-group ranking (ranking
scale) and overall ranking.
Test results are automatically evaluated in GELT since the test consists of multiple-choice questions.