Supported browsers: Edge, Chrome, Safari, Firefox
UniDic for Contemporary Written Japanese unidic_bccwj

This is a dictionary for automated Short Unit Word analysis of contemporary written Japanese. It can be used to analyse texts such as the following:

Books, magazines, newspapers, white papers, certified textbooks, public relations papers, Web bulletin boards, poems, rhymes, legal documents, Diet minutes, etc.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Lexicon size (UTF-8)

Licences
GPL v2.0/LGPL v2.1/BSD New

Old versions

References
UniDic for Contemporary Spoken Japanese unidic_csj

This is a dictionary for automated Short Unit Word analysis of transcribed contemporary spoken Japanese. It was trained on the “Corpus of Everyday Japanese Conversation (CEJC)”, so it is suited to analysing text transcribed according to the CEJC transcription criteria.

The CEJC transcription criteria are described in the following literature:

  • 臼田 泰如, 川端 良子, 西川 賢哉, 徳永 弘子, 小磯 花絵: 『日本語日常会話コーパス』の転記基準について, 言語処理学会第23回年次大会発表論文集, pp.174-177 (2017).
  • 川端 良子, 臼田 泰如, 西川 賢哉, 徳永 弘子, 小磯 花絵: 「日常会話コーパス」の転記基準と作業工程, 言語資源活用ワークショップ2016予稿集, pp.296-306 (2017).

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Lexicon size (UTF-8)

Licences
GPL v2.0/LGPL v2.1/BSD New

Old versions

References (Japanese)
  • 岡 照晃: 「言語研究のための電子化辞書」, コーパスと辞書, 講座 日本語コーパス 7, pp.1-28, 朝倉書店 (2019).
References (English)
  • Yasuharu Den, Junpei Nakamura, Toshinobu Ogiso, Hideki Ogura. A Proper Approach to Japanese Morphological Analysis: Dictionary, Model, and Evaluation, In Proceedings of the sixth international conference on Language Resources and Evaluation (LREC 2008), pp.1019-1024 (2008).
古文用UniDic
UniDic for Historical Japanese unidic_chj
  1. 旧仮名口語UniDic (UniDic for Old Kana Colloquial Japanese)
  2. 近代文語UniDic (UniDic for Modern Literary Japanese)
  3. 近世口語(洒落本)UniDic (UniDic for Edo Period Colloquial Japanese)
  4. 中世口語(狂言)UniDic (UniDic for Muromachi Period Colloquial Japanese)
  5. 中世文語(説話・随筆) UniDic (UniDic for Kamakura Period Literary Japanese)
  6. 中古和文UniDic (UniDic for Heian Period Japanese)
  7. 上代(万葉集)UniDic (UniDic for Nara Period Japanese)
旧仮名口語UniDic
UniDic for Old Kana Colloquial Japanese unidic_chj

This is a dictionary for automated Short Unit Word analysis of colloquial Japanese written in old kana orthography. It is mainly suited to analysing magazine articles written in that style.

If you intend to use the product for profit, please consult the following contact points beforehand.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Licence
Creative Commons licence

Old versions

References (Japanese)
  • 小木曽 智信: 「旧仮名遣いの口語文を対象とした形態素解析辞書」, じんもんこん2012論文集, pp.25-32 (2012).
References (English)
  • Toshinobu Ogiso, Mamoru Komachi and Yuji Matsumoto. Morphological Analysis of Historical Japanese Text, Journal of Natural Language Processing, Vol.20, No.5, pp.727-748 (2013). [in Japanese]
  • Tomoaki Kouno and Toshinobu Ogiso. Improving an Electronic Dictionary for Morphological Analysis of Japanese: Use of historical period information, In Proceedings of The 9th International Conference of ASIALEX (ASIALEX2015) (2015). [not available online]
近代文語UniDic
UniDic for Modern Literary Japanese unidic_chj

This is a dictionary for automated Short Unit Word analysis of essay articles written in the modern literary style. It can be used to analyse modern-era magazines such as Meiroku Zasshi, Taiyou, and Kokumin no Tomo.

If you intend to use the product for profit, please consult the following contact points beforehand.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Licence
Creative Commons licence

Old versions

References (Japanese)
  • 小木曽 智信, 小町 守, 松本 裕治: 「歴史的日本語資料を対象とした形態素解析」, 自然言語処理, Vol.20, No.5, pp.727-748 (2013).
References (English)
  • Toshinobu Ogiso, Mamoru Komachi and Yuji Matsumoto. Morphological Analysis of Historical Japanese Text, Journal of Natural Language Processing, Vol.20, No.5, pp.727-748 (2013). [in Japanese]
  • Tomoaki Kouno and Toshinobu Ogiso. Improving an Electronic Dictionary for Morphological Analysis of Japanese: Use of historical period information, In Proceedings of The 9th International Conference of ASIALEX (ASIALEX2015) (2015). [not available online]
近世口語(洒落本) UniDic
UniDic for Edo Period Colloquial Japanese unidic_chj

This is a dictionary for automated Short Unit Word analysis of early modern (Edo-period) colloquial materials. It is primarily used to analyse texts such as Sharebon (late Edo-period novelettes about life in the red-light districts) and Ninjobon novels.

If you intend to use the product for profit, please consult the following contact points beforehand.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Licence
Creative Commons licence

Old versions

References (Japanese)
  • 小木曽 智信, 市村 太郎, 鴻野 知暁: 「近世口語資料の形態素解析の試み」, 第4回コーパス日本語学ワークショップ予稿集, pp.145-150 (2013).
References (English)
  • Toshinobu Ogiso, Mamoru Komachi and Yuji Matsumoto. Morphological Analysis of Historical Japanese Text, Journal of Natural Language Processing, Vol.20, No.5, pp.727-748 (2013). [in Japanese]
  • Tomoaki Kouno and Toshinobu Ogiso. Improving an Electronic Dictionary for Morphological Analysis of Japanese: Use of historical period information, In Proceedings of The 9th International Conference of ASIALEX (ASIALEX2015) (2015). [not available online]
中世口語(狂言) UniDic
UniDic for Muromachi Period Colloquial Japanese unidic_chj

This is a dictionary for automated Short Unit Word analysis of medieval colloquial materials. It can be used mainly to analyse texts such as Kyōgen scripts.

If you intend to use the product for profit, please consult the following contact points beforehand.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Licence
Creative Commons licence

Old versions

References (Japanese)
  • 小木曽 智信, 鴻野 知暁, 市村 太郎: 「狂言台本の形態素解析」, 日本語学会2015年度春季大会 (2015).
References (English)
  • Toshinobu Ogiso, Mamoru Komachi and Yuji Matsumoto. Morphological Analysis of Historical Japanese Text, Journal of Natural Language Processing, Vol.20, No.5, pp.727-748 (2013). [in Japanese]
  • Tomoaki Kouno and Toshinobu Ogiso. Improving an Electronic Dictionary for Morphological Analysis of Japanese: Use of historical period information, In Proceedings of The 9th International Conference of ASIALEX (ASIALEX2015) (2015). [not available online]
中世文語(説話・随筆) UniDic
UniDic for Kamakura Period Literary Japanese unidic_chj

This is a dictionary for automated Short Unit Word analysis of medieval literary Japanese text. It can be used to analyse texts such as the following:

Konjaku Monogatarishū (Honchō-bu), Uji Shūi Monogatari, Jikkinshō, Hōjōki, Tsurezuregusa, etc.

If you intend to use the product for profit, please consult the following contact points beforehand.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Licence
Creative Commons licence

Old versions

References (Japanese)
  • 小木曽 智信, 小町 守, 松本 裕治: 「歴史的日本語資料を対象とした形態素解析」, 自然言語処理, Vol.20, No.5, pp.727-748 (2013).
References (English)
  • Toshinobu Ogiso, Mamoru Komachi and Yuji Matsumoto. Morphological Analysis of Historical Japanese Text, Journal of Natural Language Processing, Vol.20, No.5, pp.727-748 (2013). [in Japanese]
  • Tomoaki Kouno and Toshinobu Ogiso. Improving an Electronic Dictionary for Morphological Analysis of Japanese: Use of historical period information, In Proceedings of The 9th International Conference of ASIALEX (ASIALEX2015) (2015). [not available online]
中古和文UniDic
UniDic for Heian Period Japanese unidic_chj

This is a dictionary for automated Short Unit Word analysis of Early Middle Japanese (Heian-period) text. It can be used to analyse texts such as the following:

Kokin Wakashu, Tosa Nikki, Taketori Monogatari, Ise Monogatari, Ochikubo Monogatari, Yamato Monogatari, Makura no Soshi, Genji Monogatari, Murasaki Shikibu Nikki, Izumi Shikibu Nikki, Heichu Monogatari, Tsutsumi Chunagon Monogatari, Sarashina Nikki, Sanuki Naishi no Suke Nikki, Kagero Nikki, Okagami, etc.

If you intend to use the product for profit, please consult the following contact points beforehand.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Licence
Creative Commons licence

Old versions

References (Japanese)
  • 小木曽 智信, 小椋 秀樹, 田中 牧郎, 近藤 明日子, 伝 康晴: 「中古和文を対象とした形態素解析辞書の開発」, 情報処理学会研究報告 人文科学とコンピュータ, Vol.2010-CH-85, No.4, pp.1-8 (2010).
  • 小木曽 智信: 「中古仮名文学作品の形態素解析」, 日本語の研究, Vol.9, No.4, pp.49-6 (2013).
  • 小木曽 智信, 小町 守, 松本 裕治: 「歴史的日本語資料を対象とした形態素解析」, 自然言語処理, Vol.20, No.5, pp.727-748 (2013).
References (English)
  • Toshinobu Ogiso, Mamoru Komachi, Yasuharu Den and Yuji Matsumoto. UniDic for Early Middle Japanese: a Dictionary for Morphological Analysis of Classical Japanese, In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), pp.911-915 (2012).
上代(万葉集) UniDic
UniDic for Nara Period Japanese unidic_chj

This is a dictionary for automated Short Unit Word analysis of the Man'yōshū poetry collection.

If you intend to use the product for profit, please consult the following contact points beforehand.

When publicising the results of a study conducted using this dictionary for analysis, clearly state this fact. Refer to the literature in the references as needed.

Licence
Creative Commons licence

Old versions

References (Japanese)
  • 小木曽 智信, 小町 守, 松本 裕治: 「歴史的日本語資料を対象とした形態素解析」, 自然言語処理, Vol.20, No.5, pp.727-748 (2013).
References (English)
  • Toshinobu Ogiso, Mamoru Komachi and Yuji Matsumoto. Morphological Analysis of Historical Japanese Text, Journal of Natural Language Processing, Vol.20, No.5, pp.727-748 (2013). [in Japanese]
  • Tomoaki Kouno and Toshinobu Ogiso. Improving an Electronic Dictionary for Morphological Analysis of Japanese: Use of historical period information, In Proceedings of The 9th International Conference of ASIALEX (ASIALEX2015) (2015). [not available online]