PHP Classes

Fast Chinese Word Segmentation: Segment Chinese text using the RMM approach

Ratings: Not yet rated by the users
Unique User Downloads: Total: 630
Download Rankings: All time: 5,033, This week: 673 (up)
Version: fcws 1.0.0
License: Free for non-comm...
Categories: Text processing
Description 

Author

This package is intended mainly for applications used in China.

This class can segment Chinese text.

It uses the RMM (reverse maximum match) approach, so it may make some segmentation mistakes that cannot be entirely avoided.

It handles English text, but only in a very simple way.

Innovation Award
PHP Programming Innovation award nominee
July 2005
Number 9
Chinese is a language that is becoming more and more relevant on the Internet due to the growth of the Chinese economy. This growth is making it possible for many Chinese-speaking people to become Internet users.

Chinese text is written as a sequence of individual characters, with no spaces separating words. Certain encodings also include ASCII characters, allowing words in other languages to be mixed into Chinese documents.

This class provides a way to split Chinese text without breaking English words that may be mixed with the Chinese characters.

Manuel Lemos
Name: Wudi <contact>
Classes: 5 packages by
Country: China
Innovation award
Nominee: 2x

Screenshots (1)  
  • screenshot.png
Files (5)
File                    Role      Description
cwordseg_fast.lib.php   Class     Class
Readme_CN.htm           Doc.      Readme (Chinese)
Readme_EN.htm           Doc.      Readme (English)
test.php                Example   Test

The PHP Classes site has supported package installation using the Composer tool since 2013.
Version Control: 0%
Unique User Downloads: Total: 630, This week: 0
Download Rankings: All time: 5,033, This week: 673 (up)