Nexidia’s Language Identification (LID) tool gives users an automated way to identify and categorize
the spoken language used in an audio segment by rapidly
distinguishing between disparate languages and dialects.
Language Identification offers the means to automatically
segregate audio/video data into identified languages with a
high degree of confidence. This enables end users to receive only
the files that match the language(s) or custom language model most relevant to
them, which increases productivity and ensures that each file
warrants further evaluation and is evaluated
by the proper resource.
LID can be trained on any language,
dialect or customized language model that may be of interest. A
customized model is developed to recognize a category of audio particular
to an organization's needs. For example, LID can be trained to classify recorded
interviews into acceptable audio and unacceptable audio for the
purpose of hiring contact center agents. Once trained, the tool
can then automatically identify the languages or models
specified by the organization. The language identification engine
allows the user to supply an arbitrary segment of audio in any supported
format so that the Language Identification Model (LIM) can then
identify which language was most likely spoken in the audio segment.
The result reports the identified language or language model along with
the name of the input file. The ability to quickly and accurately sort
audio clips by language, dialect or custom language
model replaces the largely manual process currently used by
government and commercial organizations. As a result, these organizations
benefit from better and faster decision making, reduced
overhead and increased productivity.
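
The identification step described above (score an audio file against each trained model, report the most likely label with the file name, then sort files by label) can be illustrated with a minimal Python sketch. This is not Nexidia's API: the names LidResult, pick_most_likely and group_by_label are hypothetical, and the per-model scores are assumed to be supplied by some engine whose internals this document does not describe.

```python
from typing import Dict, List, NamedTuple


class LidResult(NamedTuple):
    """Hypothetical result record: input file name plus the winning label."""
    filename: str
    label: str


def pick_most_likely(filename: str, scores: Dict[str, float]) -> LidResult:
    """Return the language or custom model with the highest score.

    `scores` maps each trained model name to a likelihood-style score.
    How a real engine computes these scores is an assumption left open here.
    """
    if not scores:
        raise ValueError("at least one language model score is required")
    best_label = max(scores, key=lambda name: scores[name])
    return LidResult(filename=filename, label=best_label)


def group_by_label(results: List[LidResult]) -> Dict[str, List[str]]:
    """Sort file names into one bucket per identified label."""
    groups: Dict[str, List[str]] = {}
    for result in results:
        groups.setdefault(result.label, []).append(result.filename)
    return groups


# Illustrative values only; these are not measured scores.
results = [
    pick_most_likely("call_001.wav", {"english": 0.91, "spanish": 0.07}),
    pick_most_likely("call_002.wav", {"english": 0.12, "spanish": 0.85}),
    pick_most_likely("call_003.wav", {"english": 0.77, "spanish": 0.20}),
]
routed = group_by_label(results)
```

Each bucket in routed can then be passed to the reviewer or downstream process responsible for that language or custom model.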