Playing with RMMSeg

Free to repost, provided the original source and author are clearly credited with a hyperlink.
Original URL: http://www.blogkid.net/archives/1334.html

As I mentioned before, RMMSeg is a great tool for analyzing Chinese text. Today I ran a few tests, just for fun.

To install RMMSeg, run this in a shell:

gem install rmmseg

Or, if you get the “uninitialized constant Gem::GemRunner (NameError)” error, try:

gem1.8 install rmmseg

Once it is installed, we can call the analyzer like this:

root@:~# echo "我爱北京天安门" | rmmseg
我爱 北京 天安门
root@:~# echo "blogkid爱北京天安门" | rmmseg
blogkid 爱 北京 天安门

root@:~# echo "2005年进入杭州电子科技大学软件工程专业" | rmmseg
2005 年 进入 杭州 电子 科技 大学 软件 工程 专业

Hmm, RMMSeg’s dictionary does not contain the word “软件工程” (software engineering), so it was split into “软件” and “工程”. We can add it by hand, although this is not recommended.

vim /path_to_ruby/gems/1.8/gems/rmmseg-0.1.6/data/words.dic

You’ll see a list of words. Just add “软件工程” on a new line, then save and exit.
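If you would rather not open an editor, the same change can be sketched as a small shell function. This is only a rough sketch: add_word is a hypothetical helper name of my own, and it assumes the dictionary really is a plain list with one word per line, as described above. It keeps a backup copy and skips words that are already there.

```shell
# Hypothetical helper (not part of RMMSeg): append a word to a
# one-word-per-line dictionary file, keeping a backup of the original.
add_word() {
    dict="$1"
    word="$2"

    # Back up the dictionary the first time we touch it.
    if [ ! -f "$dict.bak" ]; then
        cp "$dict" "$dict.bak"
    fi

    # If the file does not end with a newline, add one so the new
    # word does not get glued onto the last existing entry.
    if [ -n "$(tail -c 1 "$dict")" ]; then
        echo >> "$dict"
    fi

    # -x matches whole lines only, -F treats the word as a literal string.
    if grep -qxF -- "$word" "$dict"; then
        echo "already present: $word"
    else
        printf '%s\n' "$word" >> "$dict"
        echo "added: $word"
    fi
}

# Usage (substitute the real location of words.dic on your system):
# add_word /path_to_ruby/gems/1.8/gems/rmmseg-0.1.6/data/words.dic "软件工程"
```

The backup matters because this edits a file inside the installed gem, so the change may be lost or cause confusion when the gem is upgraded or reinstalled.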

root@:~# echo "2005年进入杭州电子科技大学软件工程专业" | rmmseg
2005 年 进入 杭州 电子 科技 大学 软件工程 专业

Now “软件工程” comes out as a single word.

Thanks to pluskid.
