PHP + MySQL: handling emoji that cannot be inserted

Date: 2023-01-11 14:39:18

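I noticed by accident that a string containing emoji cannot be inserted into a MySQL database when the database uses the utf8 character set. MySQL's utf8 stores at most three bytes per character, while most emoji need four bytes in UTF-8.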
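A quick way to check whether a string contains characters that MySQL's 3-byte utf8 character set cannot store is to look for code points above U+FFFF, which take four bytes in UTF-8. The sketch below is only an illustration; the function name `has_four_byte_chars` is hypothetical and not part of the original post.

```php
<?php
// Hypothetical helper: true if $text contains any code point above U+FFFF,
// i.e. any character that is encoded as four bytes in UTF-8.
function has_four_byte_chars(string $text): bool
{
    return preg_match('/[\x{10000}-\x{10FFFF}]/u', $text) === 1;
}

var_dump(strlen("\u{1F600}"));                    // int(4): U+1F600 is four bytes in UTF-8
var_dump(has_four_byte_chars("Title \u{1F600}")); // bool(true)
var_dump(has_four_byte_chars("Plain title"));     // bool(false)
```

Symbols such as U+2600 sit below U+FFFF and take only three bytes, so they are not the ones that make the insert fail.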


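I only use this data to display the titles of WeChat official account articles and do not care much about the emoji, so they can simply be filtered out.

The following methods can be used to strip them.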

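/**
 * Remove emoji from a string.
 * @param string $oriStr
 * @return string
 */
public static function remove_emoji($oriStr) {
    // After json_encode(), non-ASCII characters appear as \uXXXX escapes.
    // This pattern strips \uDxxx and \uExxx escapes: surrogate halves (used
    // for emoji) and private-use code points. It also covers the Hangul
    // syllables in U+D000-U+D7A3, so do not use it on Korean text.
    $regex = '/(\\\u[ed][0-9a-f]{3})/i';
    $oriStr = json_encode($oriStr);
    $oriStr = preg_replace($regex, '', $oriStr);
    return json_decode($oriStr);
}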
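To see why the JSON approach works, it helps to look at what `json_encode()` produces for an emoji: a pair of `\ud8xx\udcxx` surrogate escapes. The sketch below is a narrower variant that removes only such pairs, so Hangul and other characters below U+FFFF survive. The function name `strip_astral_via_json` is hypothetical, and this is one possible tightening of the pattern rather than the original author's method.

```php
<?php
// Hypothetical helper: removes only characters that json_encode() writes as
// a surrogate pair (code points above U+FFFF); other text is left alone.
function strip_astral_via_json(string $text): string
{
    $json = json_encode($text);
    if ($json === false) {
        return $text; // input was not valid UTF-8; leave it untouched
    }
    // A high surrogate (\ud800-\udbff) immediately followed by a low
    // surrogate (\udc00-\udfff), as written in the JSON text.
    $pattern = '/\\\\ud[89ab][0-9a-f]{2}\\\\ud[c-f][0-9a-f]{2}/i';
    return json_decode(preg_replace($pattern, '', $json));
}

echo json_encode("Hi \u{1F600}"), "\n";                  // "Hi \ud83d\ude00"
echo strip_astral_via_json("Hi \u{1F600}! 한글"), "\n";  // Hi ! 한글
```

The trade-off is that private-use code points (U+E000 and up), which some older phones used for emoji, are no longer removed.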


public static function removeEmoji($text) {
    $clean_text = "";

    // Match Emoticons
    $regexEmoticons = '/[\x{1F600}-\x{1F64F}]/u';
    $clean_text = preg_replace($regexEmoticons, '', $text);

    // Match Miscellaneous Symbols and Pictographs
    $regexSymbols = '/[\x{1F300}-\x{1F5FF}]/u';
    $clean_text = preg_replace($regexSymbols, '', $clean_text);

    // Match Transport And Map Symbols
    $regexTransport = '/[\x{1F680}-\x{1F6FF}]/u';
    $clean_text = preg_replace($regexTransport, '', $clean_text);

    // Match Miscellaneous Symbols
    $regexMisc = '/[\x{2600}-\x{26FF}]/u';
    $clean_text = preg_replace($regexMisc, '', $clean_text);

    // Match Dingbats
    $regexDingbats = '/[\x{2700}-\x{27BF}]/u';
    $clean_text = preg_replace($regexDingbats, '', $clean_text);

    return $clean_text;
}
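The five passes above can also be folded into a single character class, which scans the string once instead of five times. This is a minimal sketch covering the same five Unicode blocks; the function name `remove_emoji_blocks` is hypothetical.

```php
<?php
// Hypothetical helper: strips the same five Unicode blocks in one pass.
function remove_emoji_blocks(string $text): string
{
    $pattern = '/[\x{1F300}-\x{1F5FF}'  // Miscellaneous Symbols and Pictographs
             . '\x{1F600}-\x{1F64F}'    // Emoticons
             . '\x{1F680}-\x{1F6FF}'    // Transport and Map Symbols
             . '\x{2600}-\x{26FF}'      // Miscellaneous Symbols
             . '\x{2700}-\x{27BF}]/u';  // Dingbats
    // preg_replace() returns null if $text is not valid UTF-8;
    // in that case return the input unchanged.
    return preg_replace($pattern, '', $text) ?? $text;
}

echo remove_emoji_blocks("A\u{2600}B\u{1F680}C\u{1F600}"), "\n"; // ABC
```

These blocks do not cover every emoji. For example, characters in Supplemental Symbols and Pictographs (U+1F900 to U+1F9FF) pass through unchanged, so the ranges may need extending depending on the input.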

Methods from other references:

function remove_emoji($text){
return preg_replace('/([0-9#][\x{20E3}])|[\x{00ae}\x{00a9}\x{203C}\x{2047}\x{2048}\x{2049}\x{3030}\x{303D}\x{2139}\x{2122}\x{3297}\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F6FF}][\x{FE00}-\x{FEFF}]?/u', '', $text);
}

function removeEmojis( $string ) {
    $string = str_replace( "?", "{%}", $string );
    $string = mb_convert_encoding( $string, "ISO-8859-1", "UTF-8" );
    $string = mb_convert_encoding( $string, "UTF-8", "ISO-8859-1" );
    $string = str_replace( array( "?", "? ", " ?" ), array(""), $string );
    $string = str_replace( "{%}", "?", $string );
    return trim( $string );
}

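Explanation:

  • Temporarily replace every literal "?" in the input with the placeholder "{%}".

  • Convert the string from UTF-8 to ISO-8859-1. mb_convert_encoding() replaces every character that ISO-8859-1 cannot represent, emoji included, with "?".

  • Convert the string back to UTF-8.

  • Remove every "?".

  • Restore the original "?" characters from the placeholder.

This only works if the input string is UTF-8. It also removes every character outside ISO-8859-1, not just emoji, so it is unsuitable for Chinese text.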
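The round trip through ISO-8859-1 affects more than emoji. The sketch below isolates just the two conversion steps to show what survives; the function name `latin1_round_trip` is hypothetical, and the output assumes mbstring's default substitute character ("?").

```php
<?php
// Hypothetical helper: the two mb_convert_encoding() steps on their own.
function latin1_round_trip(string $text): string
{
    $latin1 = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
    return mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
}

echo latin1_round_trip("café \u{1F600} 微信"), "\n"; // café ? ??
```

Accented Latin letters come back intact, but the emoji and both Chinese characters have become "?", which the later str_replace() step then deletes. For titles that contain Chinese, one of the regex-based filters is likely the safer choice.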


In fact, an open-source conversion library already exists:
http://code.iamcal.com/php/emoji/
https://github.com/iamcal/php-emoji


Article URL: http://blog.csdn.net/aerchi/article/details/68485987