Flourish PHP Unframework

fUTF8

static class, v1.0.0b16

Provides string functions for UTF-8 strings

This class is implemented to provide a UTF-8 version of almost every built-in PHP string function. For more information about UTF-8, please visit http://flourishlib.com/docs/UTF-8.

Changes:
1.0.0b16: Added code to clean() to use mbstring if available since recent versions of iconv and IGNORE now return FALSE for bad encodings (9/21/12)
1.0.0b15: Fixed a bug with using IBM's iconv implementation on AIX (7/29/11)
1.0.0b14: Added a workaround for iconv having issues in MAMP 1.9.4+ (7/26/11)
1.0.0b13: Fixed notices from being thrown when invalid data is sent to clean() (6/10/11)
1.0.0b12: Fixed a variable name typo in sub() (5/9/11)
1.0.0b11: Updated the class to not use phpinfo() to determine the iconv implementation (11/4/10)
1.0.0b10: Fixed a bug with capitalizing a lowercase i resulting in a dotted upper-case I (11/1/10)
1.0.0b9: Updated class to use fCore::startErrorCapture() instead of error_reporting() (8/9/10)
1.0.0b8: Removed e flag from preg_replace() calls (6/8/10)
1.0.0b7: Added the methods trim(), rtrim() and ltrim() (5/11/10)
1.0.0b6: Fixed clean() to work with PHP installs that use an iconv library that doesn't support IGNORE (3/2/10)
1.0.0b5: Changed ucwords() to also uppercase words right after various punctuation (9/18/09)
1.0.0b4: Changed replacement values in preg_replace() calls to be properly escaped (6/11/09)
1.0.0b3: Fixed a parameter name in rpos() from $search to $needle (2/6/09)
1.0.0b2: Fixed a bug in explode() with newlines and zero-length delimiters (2/5/09)
1.0.0b: The initial implementation (6/1/08)

Static Methods

::ascii() internal public

Please note: this method is public, however it is primarily intended for internal use by Flourish and will normally not be useful in site/application code

Maps UTF-8 ASCII-based Latin characters, punctuation, symbols and number forms to ASCII

Any characters or symbols that cannot be translated will be removed.

This function is most useful in situations that only allow ASCII, such as in URLs.

Translates elements from the following Unicode blocks:

  • Latin-1 Supplement
  • Latin Extended-A
  • Latin Extended-B
  • IPA Extensions
  • Latin Extended Additional
  • General Punctuation
  • Letterlike Symbols
  • Number Forms

Signature

string ascii( string $string )

Parameters

string $string The string to convert

Returns

The input string in pure ASCII
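
The translate-then-strip behaviour described above can be sketched in a few lines. This is only an illustration: the function name sketch_to_ascii() and its tiny mapping table are hypothetical, and the real class covers far more characters across the blocks listed above.

```php
<?php
// Hypothetical helper, not part of Flourish: a tiny transliteration table
// followed by removal of anything that is still not ASCII.
function sketch_to_ascii(string $string): string
{
    // Illustrative mappings only; chosen by hand for this example.
    $map = array(
        'ñ' => 'n',
        'ü' => 'u',
        'é' => 'e',
        'ß' => 'ss',
        '–' => '-',
        '½' => '1/2',
    );

    // Translate the characters the table knows about...
    $string = strtr($string, $map);

    // ...then drop every remaining non-ASCII byte.
    return preg_replace('/[^\x00-\x7F]/', '', $string);
}

echo sketch_to_ascii('Señor Müller – ½'), "\n";
echo sketch_to_ascii('snow☃man'), "\n";
```

Characters missing from the table, such as the snowman in the second call, are removed rather than replaced.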

::chr() public

Converts a Unicode value into a UTF-8 character

Signature

string chr( mixed $unicode_code_point )

Parameters

mixed $unicode_code_point The character to create, either the U+hex or decimal code point

Returns

The UTF-8 character
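
As a rough sketch of what such a conversion involves, the following encodes a code point by hand. The name sketch_utf8_chr() is hypothetical, and the sketch does no validation of surrogates or values above U+10FFFF, which a real implementation may well handle differently.

```php
<?php
// Hypothetical helper, not part of Flourish: encodes a code point as UTF-8.
function sketch_utf8_chr($code_point): string
{
    // Accept either a "U+00E9" style string or a decimal integer.
    if (is_string($code_point) && stripos($code_point, 'U+') === 0) {
        $code_point = hexdec(substr($code_point, 2));
    }
    $cp = (int) $code_point;

    if ($cp < 0x80) {
        // One byte: plain ASCII.
        return chr($cp);
    }
    if ($cp < 0x800) {
        // Two bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
            . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        // Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    // Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

echo sketch_utf8_chr('U+20AC'), "\n"; // euro sign, from the U+hex form
echo sketch_utf8_chr(233), "\n";      // e with acute accent, from the decimal form
```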

::clean() public

Removes any invalid UTF-8 characters from a string or array of strings

Signature

string clean( array|string $value )

Parameters

array|string $value The string or array of strings to clean

Returns

The cleaned string, or the cleaned array of strings when an array is passed
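
The changelog indicates that the class relies on iconv or mbstring for cleaning. To avoid depending on either extension, the sketch below uses a different technique: a byte-level regular expression that keeps only well-formed UTF-8 sequences. The name sketch_clean() is hypothetical, and cleaning arrays element by element is an assumption based on the description above.

```php
<?php
// Hypothetical helper, not part of Flourish: keeps only well-formed UTF-8
// sequences and silently drops every other byte.
function sketch_clean($value)
{
    // Assumption: arrays are cleaned element by element.
    if (is_array($value)) {
        return array_map('sketch_clean', $value);
    }

    // Byte-level pattern (no u modifier) for one valid UTF-8 sequence.
    $valid = '/[\x00-\x7F]'                   // ASCII
        . '|[\xC2-\xDF][\x80-\xBF]'           // two-byte sequences
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'       // three bytes, no overlongs
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
        . '|\xED[\x80-\x9F][\x80-\xBF]'       // three bytes, no surrogates
        . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'    // four bytes, no overlongs
        . '|[\xF1-\xF3][\x80-\xBF]{3}'
        . '|\xF4[\x80-\x8F][\x80-\xBF]{2}/';  // up to U+10FFFF

    // Collect every valid sequence; invalid bytes in between are skipped.
    preg_match_all($valid, (string) $value, $matches);
    return implode('', $matches[0]);
}

var_dump(sketch_clean("caf\xC3\xA9 \xFFok"));
```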

::cmp() public

Compares strings, ordering Latin characters that are based on ASCII letters after the ASCII characters they correspond to

Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, there are technical limitations.
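
One way to obtain the ordering described above is to compare ASCII-folded copies of the strings first and fall back to the raw bytes on a tie. The sketch below assumes that approach; sketch_cmp() and its small folding table are hypothetical and only cover a few characters.

```php
<?php
// Hypothetical helper, not part of Flourish: compares ASCII-folded copies
// first, then breaks ties on the original bytes.
function sketch_cmp(string $str1, string $str2): int
{
    // Illustrative folding table only.
    $fold = array('á' => 'a', 'é' => 'e', 'ö' => 'o', 'Á' => 'A');

    $res = strcmp(strtr($str1, $fold), strtr($str2, $fold));

    // On a tie the raw comparison puts "a" before "á".
    return $res !== 0 ? $res : strcmp($str1, $str2);
}

$words = array('zebra', 'ábaco', 'abaco', 'apple');
usort($words, 'sketch_cmp');
echo implode(', ', $words), "\n"; // abaco, ábaco, apple, zebra
```

A plain byte-wise sort() of the same list would instead place "ábaco" last, after "zebra".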

Signature

integer cmp( string $str1, string $str2 )

Parameters

string $str1 The first string to compare
string $str2 The second string to compare

Returns

< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2

::explode() public

Explodes a string on a delimiter

If no delimiter is provided, the string will be exploded with each character being an element in the array.

Signature

array explode( string $string, string $delimiter=NULL )

Parameters

string $string The string to explode
string $delimiter The string to explode on. If NULL or '' this method will return one character per array index.

Returns

The exploded string
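
The per-character case can be sketched with PCRE's u modifier, which makes an empty pattern split between code points rather than bytes. The name sketch_explode() is hypothetical and this is not necessarily how the class does it.

```php
<?php
// Hypothetical helper, not part of Flourish.
function sketch_explode(string $string, $delimiter = null): array
{
    if ($delimiter === null || $delimiter === '') {
        // Empty pattern + u modifier: split between UTF-8 characters.
        return preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    }

    // The native explode() is safe when both arguments are valid UTF-8.
    return explode($delimiter, $string);
}

print_r(sketch_explode("a\xC3\xB1b"));        // three elements, one per character
print_r(sketch_explode("a,\xC3\xB1,b", ',')); // split on the comma
```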

::icmp() public

Compares strings in a case-insensitive manner, ordering characters that are based on ASCII letters after the ASCII characters they correspond to

Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, there are technical limitations.

Signature

integer icmp( string $str1, string $str2 )

Parameters

string $str1 The first string to compare
string $str2 The second string to compare

Returns

< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2

::inatcmp() public

Compares strings using a natural order algorithm in a case-insensitive manner, ordering Latin characters that are based on ASCII letters after the ASCII characters they correspond to

Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, there are technical limitations.

Signature

integer inatcmp( string $str1, string $str2 )

Parameters

string $str1 The first string to compare
string $str2 The second string to compare

Returns

< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2

::ipos() public

Finds the first position (in characters) of the search value in the string - case is ignored when performing a match

Signature

mixed ipos( string $haystack, string $needle, integer $offset=0 )

Parameters

string $haystack The string to search in
string $needle The string to search for. This match will be done in a case-insensitive manner.
integer $offset The character position to start searching from

Returns

The integer character position of the first occurrence of the needle or FALSE if no match

::ireplace() public

Replaces matching parts of the string, with matches being done in a case-insensitive manner

If $search and $replace are both arrays and $replace is shorter, the extra $search strings will be replaced with an empty string. If $search is an array and $replace is a string, all $search values will be replaced with the string specified.

Signature

string ireplace( string $string, mixed $search, mixed $replace )

Parameters

string $string The string to perform the replacements on
mixed $search The string (or array of strings) to search for - see method description for details
mixed $replace The string (or array of strings) to replace with - see method description for details

Returns

The input string with the specified replacements

::irpos() public

Finds the last position (in characters) of the search value in the string - case is ignored when performing a match

Signature

mixed irpos( string $haystack, string $needle, integer $offset=0 )

Parameters

string $haystack The string to search in
string $needle The string to search for. This match will be done in a case-insensitive manner.
integer $offset The character position to start searching from. A negative value will stop looking that many characters from the end of the string

Returns

The integer character position of the last occurrence of the needle or FALSE if no match

::istr() public

Matches a string needle in the string haystack, returning a substring from the beginning of the needle to the end of the haystack

Can optionally return the part of the haystack before the needle. Matching is done in a case-insensitive manner.

Signature

mixed istr( string $haystack, string $needle, boolean $before_needle=FALSE )

Parameters

string $haystack The string to search in
string $needle The string to search for. This match will be done in a case-insensitive manner.
boolean $before_needle If a substring of the haystack before the needle should be returned instead of the substring from the needle to the end of the haystack

Returns

The specified part of the haystack, or FALSE if the needle was not found

::len() public

Determines the length (in characters) of a string

Signature

integer len( string $string )

Parameters

string $string The string to measure

Returns

The number of characters in the string
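
The difference between bytes and characters is easy to see with a small sketch. The name sketch_len() is hypothetical; it counts code points with PCRE rather than reproducing the class's own implementation.

```php
<?php
// Hypothetical helper, not part of Flourish: counts UTF-8 code points.
function sketch_len(string $string): int
{
    // With the u and s modifiers, "." matches exactly one character,
    // newlines included. preg_match_all() returns the match count,
    // or FALSE when the subject is not valid UTF-8.
    $count = preg_match_all('/./us', $string);
    return $count === false ? 0 : $count;
}

$word = "na\xC3\xAFve"; // "naive" with a diaeresis on the i

echo strlen($word), "\n";     // 6 (bytes)
echo sketch_len($word), "\n"; // 5 (characters)
```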

::lower() public

Converts all uppercase characters to lowercase

Signature

string lower( string $string )

Parameters

string $string The string to convert

Returns

The input string with all uppercase characters in lowercase

::ltrim() public

Trims whitespace, or any specified characters, from the beginning of a string

Signature

string ltrim( string $string, string $charlist=NULL )

Parameters

string $string The string to trim
string $charlist The characters to trim

Returns

The trimmed string

::natcmp() public

Compares strings using a natural order algorithm, ordering Latin characters that are based on ASCII letters after the ASCII characters they correspond to

Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, there are technical limitations.

Signature

integer natcmp( string $str1, string $str2 )

Parameters

string $str1 The first string to compare
string $str2 The second string to compare

Returns

< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2

::ord() public

Converts a UTF-8 character to a Unicode code point

Signature

string ord( string $character )

Parameters

string $character The character to decode

Returns

The U+hex Unicode code point for the character
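
Decoding a single character back to its code point could look roughly like this. The name sketch_utf8_ord() is hypothetical, the input is assumed to be one well-formed UTF-8 character, and zero-padding the hex value to four digits is an assumption about the output format.

```php
<?php
// Hypothetical helper, not part of Flourish: decodes one UTF-8 character.
function sketch_utf8_ord(string $character): string
{
    if ($character === '') {
        throw new InvalidArgumentException('Expected exactly one UTF-8 character');
    }

    // Split into bytes and convert each to its integer value.
    $b = array_map('ord', str_split($character));

    switch (count($b)) {
        case 1:
            $cp = $b[0];
            break;
        case 2:
            $cp = (($b[0] & 0x1F) << 6) | ($b[1] & 0x3F);
            break;
        case 3:
            $cp = (($b[0] & 0x0F) << 12) | (($b[1] & 0x3F) << 6) | ($b[2] & 0x3F);
            break;
        case 4:
            $cp = (($b[0] & 0x07) << 18) | (($b[1] & 0x3F) << 12)
                | (($b[2] & 0x3F) << 6) | ($b[3] & 0x3F);
            break;
        default:
            throw new InvalidArgumentException('Expected exactly one UTF-8 character');
    }

    // Assumption: pad to at least four hex digits, e.g. U+00E9.
    return 'U+' . strtoupper(str_pad(dechex($cp), 4, '0', STR_PAD_LEFT));
}

echo sketch_utf8_ord("\xE2\x82\xAC"), "\n"; // U+20AC
```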

::pad() public

Pads a string to the number of characters specified

Signature

string pad( string $string, integer $pad_length, string $pad_string=' ', string $pad_type='right' )

Parameters

string $string The string to pad
integer $pad_length The character length to pad the string to
string $pad_string The string to pad the source string with
string $pad_type The type of padding to do: 'left', 'right', 'both'

Returns

The input string padded to the specified character width
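
The native str_pad() counts bytes, so a multi-byte string ends up under-padded. A character-aware version might look like the sketch below; sketch_pad() is a hypothetical name, and splitting the 'both' padding with the smaller half on the left mirrors str_pad() as an assumption.

```php
<?php
// Hypothetical helper, not part of Flourish: pads by character count.
function sketch_pad(string $string, int $pad_length, string $pad_string = ' ', string $pad_type = 'right'): string
{
    // Split a UTF-8 string into an array of characters.
    $chars = function ($s) {
        return preg_split('//u', $s, -1, PREG_SPLIT_NO_EMPTY);
    };

    $missing = $pad_length - count($chars($string));
    if ($missing <= 0 || $pad_string === '') {
        return $string;
    }

    // Build $n characters of padding by cycling through the pad string.
    $pad_chars = $chars($pad_string);
    $make = function ($n) use ($pad_chars) {
        $out = '';
        for ($i = 0; $i < $n; $i++) {
            $out .= $pad_chars[$i % count($pad_chars)];
        }
        return $out;
    };

    if ($pad_type === 'left') {
        return $make($missing) . $string;
    }
    if ($pad_type === 'both') {
        // Assumption: the left side gets the smaller half, like str_pad().
        $left = (int) floor($missing / 2);
        return $make($left) . $string . $make($missing - $left);
    }
    return $string . $make($missing);
}

$name = "n\xC3\xA9"; // two characters, three bytes

echo str_pad($name, 5, '*', STR_PAD_LEFT), "\n"; // only two asterisks
echo sketch_pad($name, 5, '*', 'left'), "\n";    // three asterisks
```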

::pos() public

Finds the first position (in characters) of the search value in the string

Signature

mixed pos( string $haystack, string $needle, integer $offset=0 )

Parameters

string $haystack The string to search in
string $needle The string to search for
integer $offset The character position to start searching from

Returns

The integer character position of the first occurrence of the needle or FALSE if no match
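
Because the native strpos() works in bytes, both the offset and the result need converting when characters are wanted. The following is a minimal sketch of that conversion; sketch_pos() is a hypothetical name and, unlike the real method may, it only handles non-negative offsets and non-empty needles.

```php
<?php
// Hypothetical helper, not part of Flourish: strpos() in character units.
function sketch_pos(string $haystack, string $needle, int $offset = 0)
{
    // Translate the character offset into a byte offset.
    $chars = preg_split('//u', $haystack, -1, PREG_SPLIT_NO_EMPTY);
    $prefix = implode('', array_slice($chars, 0, max(0, $offset)));

    $byte_pos = strpos($haystack, $needle, strlen($prefix));
    if ($byte_pos === false) {
        return false;
    }

    // Count the characters that precede the match.
    return preg_match_all('/./us', substr($haystack, 0, $byte_pos));
}

$text = "d\xC3\xA9j\xC3\xA0 vu"; // "deja vu" with both accents

var_dump(strpos($text, 'vu'));     // int(7), a byte position
var_dump(sketch_pos($text, 'vu')); // int(5), a character position
```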

::replace() public

Replaces matching parts of the string

If $search and $replace are both arrays and $replace is shorter, the extra $search strings will be replaced with an empty string. If $search is an array and $replace is a string, all $search values will be replaced with the string specified.

Signature

string replace( string $string, mixed $search, mixed $replace )

Parameters

string $string The string to perform the replacements on
mixed $search The string (or array of strings) to search for - see method description for details
mixed $replace The string (or array of strings) to replace with - see method description for details

Returns

The input string with the specified replacements
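
The array rules described above match those of PHP's native str_replace(), so they can be illustrated with a thin wrapper that only reorders the arguments. The name sketch_replace() is hypothetical, and whether the class itself delegates to str_replace() is not stated here.

```php
<?php
// Hypothetical helper, not part of Flourish: same argument order as the
// signature above, delegating to the native str_replace().
function sketch_replace(string $string, $search, $replace): string
{
    return str_replace($search, $replace, $string);
}

$text = "na\xC3\xAFve caf\xC3\xA9";

// $replace is shorter than $search, so the space is replaced with ''.
echo sketch_replace($text, array("\xC3\xAF", "\xC3\xA9", ' '), array('i', 'e')), "\n";

// A single replacement string is used for every search value.
echo sketch_replace($text, array("\xC3\xAF", "\xC3\xA9"), '?'), "\n";
```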

::reset() internal public

Please note: this method is public, however it is primarily intended for internal use by Flourish and will normally not be useful in site/application code

Resets the configuration of the class

Signature

void reset( )

::rev() public

Reverses a string

Signature

string rev( string $string )

Parameters

string $string The string to reverse

Returns

The reversed string
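
The native strrev() reverses bytes, which breaks multi-byte characters. A character-level reversal can be sketched as below; sketch_rev() is a hypothetical name, and the sketch reverses code points, so combining marks would end up detached from their base characters.

```php
<?php
// Hypothetical helper, not part of Flourish: reverses by code point.
function sketch_rev(string $string): string
{
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return implode('', array_reverse($chars));
}

$word = "ab\xC3\xA9"; // "ab" followed by e with acute accent

echo sketch_rev($word), "\n";

// strrev() swaps the two bytes of the accented character, which is
// no longer valid UTF-8, so a u-modifier match fails with FALSE.
var_dump(preg_match('//u', strrev($word)));
```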

::rpos() public

Finds the last position (in characters) of the search value in the string

Signature

mixed rpos( string $haystack, string $needle, integer $offset=0 )

Parameters

string $haystack The string to search in
string $needle The string to search for.
integer $offset The character position to start searching from. A negative value will stop looking that many characters from the end of the string

Returns

The integer character position of the last occurrence of the needle or FALSE if no match

::rtrim() public

Trims whitespace, or any specified characters, from the end of a string

Signature

string rtrim( string $string, string $charlist=NULL )

Parameters

string $string The string to trim
string $charlist The characters to trim

Returns

The trimmed string

::str() public

Matches a string needle in the string haystack, returning a substring from the beginning of the needle to the end of the haystack

Can optionally return the part of the haystack before the needle.

Signature

mixed str( string $haystack, string $needle, boolean $before_needle=FALSE )

Parameters

string $haystack The string to search in
string $needle The string to search for
boolean $before_needle If a substring of the haystack before the needle should be returned instead of the substring from the needle to the end of the haystack

Returns

The specified part of the haystack, or FALSE if the needle was not found

::sub() public

Extracts part of a string

Signature

mixed sub( string $string, integer $start, integer $length=NULL )

Parameters

string $string The string to extract from
integer $start The zero-based starting index to extract from. Negative values will start the extraction that many characters from the end of the string.
integer $length The length of the string to extract. If an empty value is provided, the remainder of the string will be returned.

Returns

The extracted substring or FALSE if the start is out of bounds
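
A character-based extraction can be sketched by slicing an array of characters. The name sketch_sub() is hypothetical, and two details are assumptions: only NULL is treated as an empty length, and a start at or beyond the end of the string counts as out of bounds.

```php
<?php
// Hypothetical helper, not part of Flourish: substr() in character units.
function sketch_sub(string $string, int $start, $length = null)
{
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    $count = count($chars);

    // Negative starts count back from the end of the string.
    if ($start < 0) {
        $start += $count;
    }

    // Assumption: anything outside 0..count-1 is out of bounds.
    if ($start < 0 || $start >= $count) {
        return false;
    }

    // A NULL length makes array_slice() return the remainder.
    return implode('', array_slice($chars, $start, $length));
}

$text = "d\xC3\xA9j\xC3\xA0 vu"; // "deja vu" with both accents

var_dump(sketch_sub($text, 1, 3)); // the three characters after "d"
var_dump(sketch_sub($text, -2));   // string(2) "vu"
var_dump(sketch_sub($text, 20));   // bool(false)
```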

::trim() public

Trims whitespace, or any specified characters, from the beginning and end of a string

Signature

string trim( string $string, string $charlist=NULL )

Parameters

string $string The string to trim
string $charlist The characters to trim, .. indicates a range

Returns

The trimmed string
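
The native trim() treats its character list as a set of bytes, so it can cut a multi-byte character in half. The sketch below trims whole characters with a u-modifier regular expression instead; sketch_trim() is a hypothetical name, and the .. range syntax mentioned above is not supported by it.

```php
<?php
// Hypothetical helper, not part of Flourish: trims whole UTF-8 characters.
function sketch_trim(string $string, $charlist = null): string
{
    if ($charlist === '') {
        return $string;
    }

    // Default to whitespace; otherwise escape the list for use inside [...].
    $class = $charlist === null ? '\s' : preg_quote($charlist, '/');

    // u: treat pattern and subject as UTF-8; D: $ matches only at the very end.
    return preg_replace('/^[' . $class . ']+|[' . $class . ']+$/uD', '', $string);
}

$word = "\xC3\xB1and\xC3\xBA"; // n-tilde, "and", u-acute
$list = "\xC3\xBA";            // trim only the u-acute

echo sketch_trim($word, $list), "\n";  // the leading n-tilde is kept intact
echo bin2hex(trim($word, $list)), "\n"; // b1616e64: the n-tilde lost its first byte
```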

::ucfirst() public

Converts the first character of the string to uppercase.

Signature

string ucfirst( string $string )

Parameters

string $string The string to process

Returns

The processed string

::ucwords() public

Converts the first character of every word to uppercase

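Words are considered to start at the beginning of the string or after any whitespace character. According to the 1.0.0b5 changelog entry, words right after various punctuation are also uppercased.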

Signature

string ucwords( string $string )

Parameters

string $string The string to process

Returns

The processed string

::upper() public

Converts all lowercase characters to uppercase

Signature

string upper( string $string )

Parameters

string $string The string to convert

Returns

The input string with all lowercase characters in uppercase

::wordwrap() public

Wraps a string to a specific character width

Signature

string wordwrap( string $string, integer $width=75, string $break="\n", boolean $cut=FALSE )

Parameters

string $string The string to wrap
integer $width The character width to wrap to
string $break The string to insert as a break
boolean $cut If words longer than the character width should be split to fit

Returns

The input string wrapped to the specified character width