Adapting multilingual speech representation model for a new, underresourced language through multilingual fine-tuning and continued pretraining
https://kitami-it.repo.nii.ac.jp/records/2000562
Item type | Journal Article
---|---
Release date | 2024-02-28
Title | Adapting multilingual speech representation model for a new, underresourced language through multilingual fine-tuning and continued pretraining (en)
Language | eng
Resource type | journal article (http://purl.org/coar/resource_type/c_6501)
Access rights | open access (http://purl.org/coar/access_right/c_abf2)
Authors | Karol Nowakowski, Michal Ptaszynski, Kyoko Murasaki, Jagna Nieuważny
Abstract | In recent years, neural models learned through self-supervised pretraining on large-scale multilingual text or speech data have exhibited promising results for underresourced languages, especially when a relatively large amount of data from related language(s) is available. While this technology has the potential to facilitate tasks carried out in language documentation projects, such as speech transcription, pretraining a multilingual model from scratch for every new language would be highly impractical. We investigate the possibility of adapting an existing multilingual wav2vec 2.0 model for a new language, focusing on actual fieldwork data from a critically endangered tongue: Ainu. Specifically, we (i) examine the feasibility of leveraging data from similar languages also in fine-tuning; (ii) verify whether the model's performance can be improved by further pretraining on target language data. Our results show that continued pretraining is the most effective method to adapt a wav2vec 2.0 model for a new language and leads to a considerable reduction in error rates. Furthermore, we find that if a model pretrained on a related speech variety or an unrelated language with similar phonological characteristics is available, multilingual fine-tuning using additional data from that language can have a positive impact on speech recognition performance when there is very little labeled data in the target language. (en)
Bibliographic information | Information Processing & Management, Volume 60, Issue 2 (en)
ISSN | 0306-4573 (PISSN)
DOI | https://doi.org/10.1016/j.ipm.2022.103148
Rights | © 2023 Elsevier Ltd. All rights reserved. (en)
Publisher | Elsevier (en)
Author version flag | author (en)
Version type | AM (Accepted Manuscript; http://purl.org/coar/version/c_ab4af688f83e57aa)
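
The record itself contains no code, but to make the two adaptation strategies described in the abstract concrete, here is a minimal sketch using the Hugging Face Transformers library: continued self-supervised pretraining of an existing multilingual wav2vec 2.0 checkpoint, followed by CTC fine-tuning with a vocabulary shared between the target language and a related language. This is an illustration under stated assumptions, not the authors' implementation; the checkpoint name `facebook/wav2vec2-large-xlsr-53`, the `vocab.json` vocabulary file, the output directory, and all settings are hypothetical.

```python
# Illustrative sketch only, not the authors' code: adapting a pretrained
# multilingual wav2vec 2.0 model with Hugging Face Transformers.
from transformers import (
    Wav2Vec2CTCTokenizer,
    Wav2Vec2FeatureExtractor,
    Wav2Vec2ForCTC,
    Wav2Vec2ForPreTraining,
    Wav2Vec2Processor,
)

# Step 1: continued pretraining. Load a multilingual checkpoint and keep
# optimizing its self-supervised contrastive objective on unlabeled
# target-language audio (training loop and data collator omitted; see the
# official run_wav2vec2_pretraining_no_trainer example in Transformers).
encoder = Wav2Vec2ForPreTraining.from_pretrained(
    "facebook/wav2vec2-large-xlsr-53"  # assumed checkpoint, not from the paper
)

# Step 2: multilingual CTC fine-tuning. Build one character vocabulary that
# covers the target language plus the related language whose labeled data is
# mixed into training ("vocab.json" is a hypothetical character-to-id map).
tokenizer = Wav2Vec2CTCTokenizer(
    "vocab.json", unk_token="[UNK]", pad_token="[PAD]", word_delimiter_token="|"
)
feature_extractor = Wav2Vec2FeatureExtractor(
    feature_size=1,
    sampling_rate=16_000,
    padding_value=0.0,
    do_normalize=True,
    return_attention_mask=True,
)
processor = Wav2Vec2Processor(
    feature_extractor=feature_extractor, tokenizer=tokenizer
)

# Attach a freshly initialized CTC head on top of the adapted encoder,
# loaded here from a hypothetical directory where step 1 saved its weights.
model = Wav2Vec2ForCTC.from_pretrained(
    "./continued-pretraining-output",
    vocab_size=len(processor.tokenizer),
    ctc_loss_reduction="mean",
    pad_token_id=processor.tokenizer.pad_token_id,
)
model.freeze_feature_encoder()  # common practice when labeled data is scarce
```

In a setup like this, mixing labeled utterances from a related language into fine-tuning only requires that its characters be present in the shared vocabulary; the model architecture itself is unchanged.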
Cite as
Karol Nowakowski, Michal Ptaszynski, Kyoko Murasaki, Jagna Nieuważny, 2023. Adapting multilingual speech representation model for a new, underresourced language through multilingual fine-tuning and continued pretraining. Information Processing & Management 60(2). Elsevier. https://doi.org/10.1016/j.ipm.2022.103148