
NLP-opencc

wangzf / 2022-04-05



Introduction

Open Chinese Convert (OpenCC) is an open-source project for converting between Traditional Chinese, Simplified Chinese, and Japanese kanji (Shinjitai).

Features

Online demo (API queries are not supported)

Installing opencc for Python

pip install opencc

Using opencc

Basic configuration files

File name     Conversion (from => to)
s2t.json      Simplified Chinese to Traditional Chinese
t2s.json      Traditional Chinese to Simplified Chinese
s2tw.json     Simplified Chinese to Traditional Chinese (Taiwan standard)
tw2s.json     Traditional Chinese (Taiwan standard) to Simplified Chinese
s2hk.json     Simplified Chinese to Traditional Chinese (Hong Kong standard, following the Hong Kong primary school word list)
hk2s.json     Traditional Chinese (Hong Kong standard, following the Hong Kong primary school word list) to Simplified Chinese
s2twp.json    Simplified Chinese to Traditional Chinese (Taiwan standard), also converting to common Taiwanese vocabulary
tw2sp.json    Traditional Chinese (Taiwan standard) to Simplified Chinese, also converting to common Mainland Chinese vocabulary
t2tw.json     Traditional Chinese (OpenCC standard) to Taiwan standard
t2hk.json     Traditional Chinese (OpenCC standard) to Hong Kong standard (following the Hong Kong primary school word list)
t2jp.json     Traditional Chinese characters (OpenCC standard, Kyūjitai) to new Japanese kanji (Shinjitai)
jp2t.json     New Japanese kanji (Shinjitai) to Traditional Chinese characters (OpenCC standard, Kyūjitai)
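
The file names in the table follow a regular source-2-target pattern, so choosing one can be sketched as a small lookup. Only the file names come from the table; the short codes and the helper name `config_for` are illustrative assumptions and not part of opencc itself.

```python
# Config files from the table, keyed by (source, target) short codes.
# The short codes and this helper are illustrative, not an opencc API.
CONFIGS = {
    ("s", "t"): "s2t.json",
    ("t", "s"): "t2s.json",
    ("s", "tw"): "s2tw.json",
    ("tw", "s"): "tw2s.json",
    ("s", "hk"): "s2hk.json",
    ("hk", "s"): "hk2s.json",
    ("s", "twp"): "s2twp.json",
    ("tw", "sp"): "tw2sp.json",
    ("t", "tw"): "t2tw.json",
    ("t", "hk"): "t2hk.json",
    ("t", "jp"): "t2jp.json",
    ("jp", "t"): "jp2t.json",
}


def config_for(source: str, target: str) -> str:
    """Return the config file name for a conversion, or raise if it is not listed."""
    try:
        return CONFIGS[(source, target)]
    except KeyError:
        raise ValueError(f"no basic config for {source!r} -> {target!r}") from None
```

A pair that is not in the table, such as Hong Kong standard to Japanese, raises `ValueError` instead of silently guessing a file name.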

Python API Demo

import opencc

# Traditional Chinese => Simplified Chinese
converter = opencc.OpenCC("t2s.json")
t_data = "漢字"
s_data = converter.convert(t_data)
print(s_data)  # 汉字
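
A converter can be created once and reused for many strings. The sketch below relies only on the `OpenCC(config)` constructor and the `convert()` method used in the demo; `make_converter` and `convert_all` are hypothetical helper names introduced for illustration.

```python
from typing import Iterable, List


def make_converter(config: str = "t2s.json"):
    """Create an OpenCC converter.

    opencc is imported lazily so that convert_all below can be used
    without the package installed (for example with a stub converter).
    """
    import opencc

    return opencc.OpenCC(config)


def convert_all(texts: Iterable[str], converter) -> List[str]:
    """Convert each string with one shared converter.

    `converter` is any object with a convert(str) -> str method,
    such as the opencc.OpenCC instance from the demo.
    """
    return [converter.convert(text) for text in texts]
```

With opencc installed, `convert_all(["漢字", "開放中文轉換"], make_converter("t2s.json"))` would convert both strings from Traditional to Simplified Chinese in one call.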

Command-line demo

opencc --help
opencc_dict --help
opencc_phrase_extract --help
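
For whole files, the `opencc` command is usually given an input file, an output file, and a config. The sketch below builds that command line from Python. The `-i`, `-o`, and `-c` flags are assumed from the tool's help output and are worth confirming with `opencc --help` on your installation; `build_opencc_command`, `convert_file`, and the file names are hypothetical.

```python
import subprocess
from typing import List


def build_opencc_command(src: str, dst: str, config: str = "s2t.json") -> List[str]:
    """Build the argument list for converting the file src into dst.

    Assumes the opencc CLI accepts -i (input), -o (output), -c (config).
    """
    return ["opencc", "-i", src, "-o", dst, "-c", config]


def convert_file(src: str, dst: str, config: str = "s2t.json") -> None:
    """Run opencc on a file; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_opencc_command(src, dst, config), check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces or non-ASCII characters.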