0

0

介绍一组中文处理工具函数

php中文网

php中文网

发布时间:2016-06-21 09:12:47

|

920人浏览过

|

来源于php中文网

原创

函数|中文


/*    中文处理工具函数
--- 空格 ---
      string GBspace(string) --------- 每个中文字之间加空格
      string GBunspace(string) ------- 每个中文字之间的空格清除
      string clear_space(string) ------- 用来清除多余的空格

--- 转换 ---
      string GBcase(string,offset) --- 将字符串内的中英文字转换大小写
                              offset : "upper"   - 字符串全转为大写 (strtoupper)
                                       "lower"   - 字符串全转为小写 (strtolower)
                                       "ucwords" - 将字符串每个字第一个字母改大写 (ucwords)
                                       "ucfirst" - 将字符串第一个字母改大写 (ucfirst)
      string GBrev(string) ----------- 颠倒字符串

--- 文字检查 ---
      int GB_check(string) ----------- 检查字符串内是否有 GB 字,有会返回 true,
                                         否则会返回false
      int GB_all(string) ------------- 检查字符串内所有字是否有 GB 字,是会返回 true,
                                         否则会返回false
      int GB_non(string) ------------- 检查字符串内所有字并不是 GB 字,是会返回 true,
                                         否则会返回false
      int GBlen(string) -------------- 返回字符串长度(中文字只计一字母)

--- 查找、取代、提取 ---
      int/array GBpos(haystack,needle,[offset]) ---- 查找字符串 (strpos)
                              offset : 留空 - 查找第一个出现的位置
                                       int  - 由该位置搜索出现的第一个位置
                                       "r"  - 查找最后一次出现的位置 (strrpos)
                                       "a"  - 将所有查找到的字储存为数组(返回 array)

      string GB_replace(needle,str,haystack) -- 查找与取代字符串 (str_replace)
      string GB_replace_i(needle,str_f,str_b,haystack) -- 不检查大小写查找与取代字符串
                                         needle - 查找字母
                                         str - 取代字母 ( str_f - 该字母前, str_b 该字母后)
                                         haystack - 字符串

      string GBsubstr(string,start,[length]) -- 从string提取出由开始到结尾或长度
                                                  length的字符串。
                                                  中文字只计一字母,可使用正负数。
      string GBstrnear(string,length)         -- 从 string提取最接近 length的字符串。
                                                   length 中 中文字计2个字母。

--- 注意 ---
      如使用由 Form 返回的字符串前,请先替字符串经过 stripslashes() 处理,除去多余的 \ 。

      用法:在原 PHP 代码内加上:
      include ("GB.inc");
      即可使用以上工具函数。
*/

function GBlen($string) {
    $l = strlen($string);
    $ptr = 0;
    $a = 0;
    while ($a         $ch = substr($string,$a,1);
        $ch2 = substr($string,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $ptr++;
            $a += 2;
        } else {
            $ptr++;
            $a++;
        } // END IF
    } // END WHILE

    return $ptr;
}

function GBsubstr($string,$start,$length) {
    if (!is_int($length) && $length != "") {
        return "错误:length 值错误(必须为数值)。
";
    } elseif ($length == "0") {
        return "";
    } else {
    $l = strlen($string);
    $a = 0;
    $ptr = 0;
    $str_list = array();
    $str_list2 = array();
    while ($a         $ch = substr($string,$a,1);
        $ch2 = substr($string,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $str_list[$ptr] = $a;
            $str_list2[$ptr] = $a+1;
            $ptr++;
            $a += 2;
        } else {
            $str_list[$ptr] = $a;
            $str_list2[$ptr] = $a;
            $ptr++;
            $a++;
        } // END IF
    } // END WHILE

    if ($start > $ptr || -$start > $ptr) {
        return;
    } elseif ($length == "") {
        if ($start >= 0) { // (text,+)
            return substr($string,$str_list[$start]);
        } else { // (test,-)
            return substr($string,$str_list[$ptr + $start]);
        }
    } else {

        if ($length > 0) { // $length > 0


            if ($start >= 0) {  // (text,+,+)
                if (($start + $length) >= count($str_list2)) {
                    return substr($string,$str_list[$start]);
                } else { //(text,+,+)
                    $end = $str_list2[$start + ($length - 1)] - $str_list[$start] +1;
                    return substr($string,$str_list[$start],$end);
                }

            } else { // (text ,-,+)
                $start = $ptr + $start;
                if (($start + $length) >= count($str_list2)) {
                    return substr($string,$str_list[$start]);
                } else {
                    $end = $str_list2[$start + ($length - 1)] - $str_list[$start] +1;
                    return substr($string,$str_list[$start],$end);
                }
            }

        } else { // $length             $end = strlen($string) - $str_list[$ptr+$length];
            if ($start >= 0) {  // (text,+,-) {
                return substr($string,$str_list[$start],-$end);
            } else { //(text,-,-)
                $start = $ptr + $start;
                return substr($string,$str_list[$start],-$end);
            }

        } // END OF LENGTH > /
    }
    } // END IF
}

function GB_replace($needle,$string,$haystack) {
    $l = strlen($haystack);
    $l2 = strlen($needle);
    $l3 = strlen($string);
    $news = "";
    $skip = 0;
    $a = 0;
    while ($a         $ch = substr($haystack,$a,1);
        $ch2 = substr($haystack,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            if (substr($haystack,$a,$l2) == $needle) {
                $news .= $string;
                $a += $l2;
            } else {
                $news .= $ch.$ch2;
                $a += 2;
            }
        } else {
            if (substr($haystack,$a,$l2) == $needle) {
                $news .= $string;
                $a += $l2;
            } else {
                $news .= $ch;
                $a++;
            }
        } // END IF
    } // END WHILE
    return $news;
}

function GB_replace_i($needle,$str_f,$str_b,$haystack) {

    $l = strlen($haystack);
    $l2 = strlen($needle);
    $l3 = strlen($string);
    $news = "";
    $skip = 0;
    $a = 0;
    while ($a         $ch = substr($haystack,$a,1);
        $ch2 = substr($haystack,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            if (GBcase(substr($haystack,$a,$l2),"lower") == GBcase($needle,"lower")) {
                $news .= $str_f . substr($haystack,$a,$l2) . $str_b;
                $a += $l2;
            } else {
                $news .= $ch.$ch2;
                $a += 2;
            }
        } else {
            if (GBcase(substr($haystack,$a,$l2),"lower") == GBcase($needle,"lower")) {
                $news .= $str_f . substr($haystack,$a,$l2) . $str_b;
                $a += $l2;
            } else {
                $news .= $ch;
                $a++;
            }
        } // END IF
    } // END WHILE
    return $news;
}



function GBpos($haystack,$needle,$offset) {
    if (!is_int($offset)) {
        $offset = strtolower($offset);
        if ($offset != "" && $offset != "r" && $offset != "a") {
            return "错误:offset 值错误。
";
        }
    }
    $l = strlen($haystack);
    $l2 = strlen($needle);
    $found = false;
    $w = 0; // WORD
    $a = 0; // START

    if ($offset == "" || $offset == "r") {
        $atleast = 0;
        $value = false;
    } elseif ($offset == "a") {
        $value = array();
        $atleast = 0;
    } else {
        $value = false;
        $atleast = $offset;
    }
    while ($a         $ch = substr($haystack,$a,1);
        $ch2 = substr($haystack,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40") && $skip == 0) {
            if (substr($haystack,$a,$l2) == $needle) {
                if ($offset == "r") {
                    $found = true;
                    $value = $w;
                } elseif ($offset == "a") {
                    $found = true;
                    $value[] = $w;
                } elseif (!$value) {
                    if ($w >= $atleast) {
                        $found = true;
                        $value = $w;
                    }
                }
            }
            $a += 2;
        } else {
            if (substr($haystack,$a,$l2) == $needle) {
                if ($offset == "r") {
                    $found = true;
                    $value = $w;
                } elseif ($offset == "a") {
                    $found = true;
                    $value[] = $w;
                } elseif (!$value) {
                    if ($w >= $atleast) {
                        $found = true;
                        $value = $w;
                    }
                }
            }
            $a++;
        }
        $w++;
    } // END OF WHILE
    if ($found) {
        return $value;
    } else {
        return $false;
    }
//    } // END OF WHILE

}

function GBrev($text) {
    $news = "";
    $l = strlen($text);
    $GB = 0;
    $a = 0;
    while ($a         $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40") && $skip == 0) {
            $a += 2;
            $news = $ch . $ch2 . $news;
        } else {
            $news = $ch . $news;
            $a++;
        }
    }
    return $news;
}

function GB_check($text) {
    $l = strlen($text);
    $a = 0;
    while ($a         $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            return true;
        } else {
            return false;
        }
    }
}

function GB_all ($text) {
    $l = strlen($text);
    $all = 1;
    $a = 0;
    while ($a         $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $a += 2;
        } else {
            $a++;
            $all = 0;
        }
    }
    if ($all == 1) {
        return true;
    } else {
        return false;
    }
}

function GB_non ($text) {
    $l = strlen($text);
    $all = 1;
    $a = 0;
    while ($a         $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
            $a += 2;
            $all = 0;
        } else {
            $a++;
        }
    }
    if ($all == 1) {
        return true;
    } else {
        return false;
    }
}


function GBcase ($text,$case) {
    $case = strtolower($case);
    if ($case != "upper" && $case != "lower" && $case != "ucwords" && $case != "ucfirst") {
        return "函数用法错误。 $case";
    } else {
    $ucfirst = 0;
    $ucwords = 0;
    $news = "";
    $l = strlen($text);
    $GB = 0;
    $english = 0;

    $a = 0;
    while ($a
    $ch = substr($text,$a,1);
    if ($GB == 0 && ord($ch) >= HexDec("0x81")) {

            $GB = 1;
            $english = 0;
            $news .= $ch;
            $ucwords = 0;
    
    } elseif ($GB == 1 && ord($ch) >= HexDec("0x40") && $english == 0) {
            $news .= "$ch";    
            $ucwords = 0;
            $GB = 0;

    } else {
        if ($case == "upper") {
            $news .= strtoupper($ch);
        } elseif ($case == "lower") {
            $news .= strtolower($ch);
        } elseif ($case == "ucwords") {
            if ($ucwords == 0) {
                $news .= strtoupper($ch);
            } else {
                $news .= strtolower($ch);
            }
            $ucwords = 1;
        } elseif ($case == "ucfirst") {
            if ($ucfirst == 0) {
                $news .= strtoupper($ch);
                $ucfirst = 1;
            } else {
                $news .= strtolower($ch);
                $ucfirst = 1;
            }
        } else {
            $news .= $ch;
        }
        if ($ch == " " || $ch == "\n") {
            $ucwords = 0;
        }
        $english = 1;
        $GB = 0;
    
    }

    $a++;
    
    } // END OF while
    return $news;
    } // end else
}



function GBspace ($text) {

    $news = "";
    $l = strlen($text);
    $GB = 0;
    $english = 0;

    $a = 0;
    while ($a

    $ch = substr($text,$a,1);
    $ch2 = substr($text,$a+1,1);
    if (!($ch == " " && $ch2 == " ")) {
    if ($GB == 0) {
        if (ord($ch) >= HexDec("0x81")) {

            if ($english == 1) {
                if ((substr($text,$a-1,1) == " ") || (substr($text,$a-1,1) == "\n")) {
                    $news .= "$ch";
                } else {
                    $news .= " $ch";
                }
                $english = 0;
                $GB = 1;
            } else {
                $GB = 1;
                $english = 0;
                $news .= $ch;
            }
        } else {
            $english = 1;
            $GB = 0;
            $news .= $ch;
        }
        
    } else {
        if (ord($ch) >= HexDec("0x40")) {
            if ($english == 0) {
                if ((substr($text,$a+1,1) == " ")|| (substr($text,$a+1,1) == "\n")) {
                    $news .= "$ch";    
                } else {
                    $news .= "$ch ";
                }
            } else {
                $news .= " $ch";
            }
        } else {
            $english = 1;
            $news .= "$ch";
        }
        $GB = 0;
    }
    }
    $a++;
    } // END OF while

    // Chk 1 & last is space

    $l = strlen($news);
    if (substr($news,0,1) == " ") {
        $news = substr($news,1);
    }
    $l = strlen($news);
    if (substr($news,$l-1,1) == " ") {
        $news = substr($news,0,$l-1);
    }
    return $news;
}

function GBunspace($text) {
    $news = "";
    $l = strlen($text);
    $a = 0;
    $last_space = 1;
    while ($a     
        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        $ch3 = substr($text,$a+2,1);
        if (($a + 1) == $l ) {
            $last_space = 1;
        }
        if ($ch == " ") {
            if ($last_space == 0) {
            if (ord($ch2) >= HexDec("0x81") && ord($ch3) >= HexDec("0x40")) {
                if ($chi == 0) {
                    $news .= " ";
                    $last_space = 1;
                }
                $chi=1;


            } elseif ($ch2 != " ") {
                $news .= " ";
                $chi = 0;
                $last_space = 1;
            }
            }
        } else {
            if (ord($ch) >= HexDec("0x81") && ord($ch2) >= HexDec("0x40")) {
                $chi = 1;
                $a++;
                $news .= $ch . $ch2;
                $last_space = 0;

            } else {
                $chi = 0;
                $news .= $ch;
                $last_space = 0;
            }

        }
    $a++;
    }
    // Chk 1 & last is space

    $l = strlen($news);
    if (substr($news,0,1) == " ") {
        $news = substr($news,1);
    }
    $l = strlen($news);
    if (substr($news,$l-1,1) == " ") {
        $news = substr($news,0,$l-1);
    }
    return $news;



} // END OF Function

function GBstrnear($text,$length) {

$tex_len = strlen($text);
$a = 0;
$w = "";
while ($a         $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if (GB_all($ch.$ch2)) {
            $w .= $ch.$ch2;
            $a=$a+2;
        } else {
            $w .= $ch;
            $a++;
        }
        if ($a == $length || $a == ($length - 1)) {
            $a = $tex_len;
        }
}
return $w;
} // END OF FUNCTION

function clear_space($text) {
    $t = "";
    for ($a=0;$a        $ch = substr($text,$a,1);
        $ch2 = substr($text,$a+1,1);
        if ($ch == " " && $ch2 == " ") {
        } else {
            $t .= $ch;
        }
    }
    return $t;
}


?>



本站声明:本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn

热门AI工具

更多
DeepSeek
DeepSeek

幻方量化公司旗下的开源大模型平台

豆包大模型
豆包大模型

字节跳动自主研发的一系列大型语言模型

通义千问
通义千问

阿里巴巴推出的全能AI助手

腾讯元宝
腾讯元宝

腾讯混元平台推出的AI助手

文心一言
文心一言

文心一言是百度开发的AI聊天机器人,通过对话可以生成各种形式的内容。

讯飞写作
讯飞写作

基于讯飞星火大模型的AI写作工具,可以快速生成新闻稿件、品宣文案、工作总结、心得体会等各种文文稿

即梦AI
即梦AI

一站式AI创作平台,免费AI图片和视频生成。

ChatGPT
ChatGPT

最最强大的AI聊天机器人程序,ChatGPT不单是聊天机器人,还能进行撰写邮件、视频脚本、文案、翻译、代码等任务。

相关专题

更多
AO3官网入口与中文阅读设置 AO3网页版使用与访问
AO3官网入口与中文阅读设置 AO3网页版使用与访问

本专题围绕 Archive of Our Own(AO3)官网入口展开,系统整理 AO3 最新可用官网地址、网页版访问方式、正确打开链接的方法,并详细讲解 AO3 中文界面设置、阅读语言切换及基础使用流程,帮助用户稳定访问 AO3 官网,高效完成中文阅读与作品浏览。

45

2026.02.02

主流快递单号查询入口 实时物流进度一站式追踪专题
主流快递单号查询入口 实时物流进度一站式追踪专题

本专题聚合极兔快递、京东快递、中通快递、圆通快递、韵达快递等主流物流平台的单号查询与运单追踪内容,重点解决单号查询、手机号查物流、官网入口直达、包裹进度实时追踪等高频问题,帮助用户快速获取最新物流状态,提升查件效率与使用体验。

8

2026.02.02

Golang WebAssembly(WASM)开发入门
Golang WebAssembly(WASM)开发入门

本专题系统讲解 Golang 在 WebAssembly(WASM)开发中的实践方法,涵盖 WASM 基础原理、Go 编译到 WASM 的流程、与 JavaScript 的交互方式、性能与体积优化,以及典型应用场景(如前端计算、跨平台模块)。帮助开发者掌握 Go 在新一代 Web 技术栈中的应用能力。

4

2026.02.02

PHP Swoole 高性能服务开发
PHP Swoole 高性能服务开发

本专题聚焦 PHP Swoole 扩展在高性能服务端开发中的应用,系统讲解协程模型、异步IO、TCP/HTTP/WebSocket服务器、进程与任务管理、常驻内存架构设计。通过实战案例,帮助开发者掌握 使用 PHP 构建高并发、低延迟服务端应用的工程化能力。

3

2026.02.02

Java JNI 与本地代码交互实战
Java JNI 与本地代码交互实战

本专题系统讲解 Java 通过 JNI 调用 C/C++ 本地代码的核心机制,涵盖 JNI 基本原理、数据类型映射、内存管理、异常处理、性能优化策略以及典型应用场景(如高性能计算、底层库封装)。通过实战示例,帮助开发者掌握 Java 与本地代码混合开发的完整流程。

3

2026.02.02

go语言 注释编码
go语言 注释编码

本专题整合了go语言注释、注释规范等等内容,阅读专题下面的文章了解更多详细内容。

62

2026.01.31

go语言 math包
go语言 math包

本专题整合了go语言math包相关内容,阅读专题下面的文章了解更多详细内容。

55

2026.01.31

go语言输入函数
go语言输入函数

本专题整合了go语言输入相关教程内容,阅读专题下面的文章了解更多详细内容。

27

2026.01.31

golang 循环遍历
golang 循环遍历

本专题整合了golang循环遍历相关教程,阅读专题下面的文章了解更多详细内容。

33

2026.01.31

热门下载

更多
网站特效
/
网站源码
/
网站素材
/
前端模板

精品课程

更多
相关推荐
/
热门推荐
/
最新课程
Excel 教程
Excel 教程

共162课时 | 15.1万人学习

Pandas 教程
Pandas 教程

共15课时 | 1万人学习

C# 教程
C# 教程

共94课时 | 8.3万人学习

关于我们 免责申明 举报中心 意见反馈 讲师合作 广告合作 最新更新
php中文网:公益在线php培训,帮助PHP学习者快速成长!
关注服务号 技术交流群
PHP中文网订阅号
每天精选资源文章推送

Copyright 2014-2026 https://www.php.cn/ All Rights Reserved | php.cn | 湘ICP备2023035733号