Provides string functions for UTF-8 strings
This class is implemented to provide a UTF-8 version of almost every built-in PHP string function. For more information about UTF-8, please visit http://flourishlib.com/docs/UTF-8.
Version | Changes | Date |
---|---|---|
1.0.0b16 | Added code to clean() to use mbstring if available since recent versions of iconv and IGNORE now return FALSE for bad encodings | 9/21/12 |
1.0.0b15 | Fixed a bug with using IBM's iconv implementation on AIX | 7/29/11 |
1.0.0b14 | Added a workaround for iconv having issues in MAMP 1.9.4+ | 7/26/11 |
1.0.0b13 | Fixed notices being thrown when invalid data is sent to clean() | 6/10/11 |
1.0.0b12 | Fixed a variable name typo in sub() | 5/9/11 |
1.0.0b11 | Updated the class to not use phpinfo() to determine the iconv implementation | 11/4/10 |
1.0.0b10 | Fixed a bug with capitalizing a lowercase i resulting in a dotted upper-case I | 11/1/10 |
1.0.0b9 | Updated class to use fCore::startErrorCapture() instead of error_reporting() | 8/9/10 |
1.0.0b8 | Removed e flag from preg_replace() calls | 6/8/10 |
1.0.0b7 | Added the methods trim(), rtrim() and ltrim() | 5/11/10 |
1.0.0b6 | Fixed clean() to work with PHP installs that use an iconv library that doesn't support IGNORE | 3/2/10 |
1.0.0b5 | Changed ucwords() to also uppercase words right after various punctuation | 9/18/09 |
1.0.0b4 | Changed replacement values in preg_replace() calls to be properly escaped | 6/11/09 |
1.0.0b3 | Fixed a parameter name in rpos() from $search to $needle | 2/6/09 |
1.0.0b2 | Fixed a bug in explode() with newlines and zero-length delimiters | 2/5/09 |
1.0.0b | The initial implementation | 6/1/08 |
Please note: this method is public, however it is primarily intended for internal use by Flourish and will normally not be useful in site/application code
Maps UTF-8 ASCII-based latin characters, punctuation, symbols and number forms to ASCII
Any characters or symbols that cannot be translated will be removed.
This function is most useful for situations that only allow ASCII, such as in URLs.
Translates elements from the following unicode blocks:
string ascii( string $string )
string | $string | The string to convert |
The input string in pure ASCII
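The map-then-strip behaviour described for ascii() can be sketched roughly as follows. This is not the Flourish implementation: `sketch_to_ascii()` and its tiny `$map` are hypothetical, and the real method covers far more characters. The sketch assumes PHP 7+ and a UTF-8 encoded source file.

```php
<?php
// Hypothetical helper, not part of Flourish. The mapping below is a
// deliberately tiny, made-up sample used only for illustration.
function sketch_to_ascii(string $string): string
{
    $map = [
        'é' => 'e', 'è' => 'e', 'û' => 'u', 'ü' => 'u',
        'ñ' => 'n', 'ß' => 'ss',
    ];

    // Step 1: translate the characters that have an ASCII equivalent.
    $string = strtr($string, $map);

    // Step 2: remove every remaining non-ASCII byte.
    return preg_replace('/[^\x00-\x7F]/', '', $string);
}

echo sketch_to_ascii('Crème brûlée'), "\n"; // Creme brulee
```

Characters missing from the map (here, for example, `ï`) are dropped entirely, which mirrors the "will be removed" rule above.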
Converts a unicode value into a UTF-8 character
string chr( mixed $unicode_code_point )
mixed | $unicode_code_point | The character to create, either the U+hex or decimal code point |
The UTF-8 character
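As an illustration of what turning a code point into a UTF-8 character involves, here is a minimal sketch of the standard UTF-8 byte layout. `sketch_code_point_to_utf8()` is a hypothetical name, not a Flourish function, and the sketch does no validation of surrogates or out-of-range values.

```php
<?php
// Hypothetical helper: encodes a code point given as 'U+hex' or as a
// decimal integer. Assumes the value is a valid Unicode scalar value.
function sketch_code_point_to_utf8($code_point): string
{
    if (is_string($code_point) && strncasecmp($code_point, 'U+', 2) === 0) {
        $code_point = hexdec(substr($code_point, 2));
    }
    $cp = (int) $code_point;

    if ($cp < 0x80) {            // 1 byte:  0xxxxxxx
        return chr($cp);
    }
    if ($cp < 0x800) {           // 2 bytes: 110xxxxx 10xxxxxx
        return chr(0xC0 | ($cp >> 6))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {         // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return chr(0xF0 | ($cp >> 18))
         . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F))
         . chr(0x80 | ($cp & 0x3F));
}

echo sketch_code_point_to_utf8('U+00E9'), "\n"; // é
echo sketch_code_point_to_utf8(8364), "\n";     // €
```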
Removes any invalid UTF-8 characters from a string or array of strings
string clean( array|string $value )
array|string | $value | The string or array of strings to clean |
The cleaned string
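The changelog indicates that clean() relies on iconv or mbstring. The sketch below uses a different technique, a byte-level regular expression that keeps only well-formed UTF-8 sequences, simply to show what "removing invalid characters" means. `sketch_clean()` is a hypothetical name and its behaviour on edge cases may differ from the real method.

```php
<?php
// Hypothetical helper, not the Flourish implementation: keeps only byte
// sequences that are well-formed UTF-8 and drops everything else.
function sketch_clean($value)
{
    if (is_array($value)) {
        // Clean each element, preserving the array keys.
        return array_map('sketch_clean', $value);
    }

    $valid = '/[\x00-\x7F]'                     // ASCII
           . '|[\xC2-\xDF][\x80-\xBF]'          // 2-byte sequences
           . '|\xE0[\xA0-\xBF][\x80-\xBF]'      // 3-byte, no overlongs
           . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
           . '|\xED[\x80-\x9F][\x80-\xBF]'      // 3-byte, no surrogates
           . '|\xF0[\x90-\xBF][\x80-\xBF]{2}'   // 4-byte, no overlongs
           . '|[\xF1-\xF3][\x80-\xBF]{3}'
           . '|\xF4[\x80-\x8F][\x80-\xBF]{2}/'; // up to U+10FFFF

    preg_match_all($valid, (string) $value, $matches);
    return implode('', $matches[0]);
}

// The stray 0xFF byte is not valid UTF-8 and is removed.
echo sketch_clean("caf\xC3\xA9\xFF!"), "\n"; // café!
```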
Compares strings, with the resulting order having latin characters that are based on ASCII letters placed after the relative ASCII characters
Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, it has technical limitations.
integer cmp( string $str1, string $str2 )
string | $str1 | The first string to compare |
string | $str2 | The second string to compare |
< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2
Explodes a string on a delimiter
If no delimiter is provided, the string will be exploded with each character being an element in the array.
array explode( string $string, string $delimiter=NULL )
string | $string | The string to explode |
string | $delimiter | The string to explode on. If NULL or '' this method will return one character per array index. |
The exploded string
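The two modes described above (split on a delimiter, or one character per element) might look like this in plain PHP. `sketch_explode()` is a hypothetical stand-in, assuming valid UTF-8 input and PHP 7.1+; it is not the Flourish code.

```php
<?php
// Hypothetical helper illustrating the documented behaviour.
function sketch_explode(string $string, ?string $delimiter = null): array
{
    if ($delimiter === null || $delimiter === '') {
        // An empty pattern with the u modifier splits between characters,
        // not between bytes.
        return preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    }

    // The byte-based explode() is safe here because a valid UTF-8
    // delimiter cannot match in the middle of another character.
    return explode($delimiter, $string);
}

print_r(sketch_explode('héllo'));      // one element per character
print_r(sketch_explode('a→b→c', '→')); // split on a multi-byte delimiter
```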
Compares strings in a case-insensitive manner, with the resulting order having characters that are based on ASCII letters placed after the relative ASCII characters
Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, it has technical limitations.
integer icmp( string $str1, string $str2 )
string | $str1 | The first string to compare |
string | $str2 | The second string to compare |
< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2
Compares strings using a natural order algorithm in a case-insensitive manner, with the resulting order having latin characters that are based on ASCII letters placed after the relative ASCII characters
Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, it has technical limitations.
integer inatcmp( string $str1, string $str2 )
string | $str1 | The first string to compare |
string | $str2 | The second string to compare |
< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2
Finds the first position (in characters) of the search value in the string - case is ignored when performing a match
mixed ipos( string $haystack, string $needle, integer $offset=0 )
string | $haystack | The string to search in |
string | $needle | The string to search for. This match will be done in a case-insensitive manner. |
integer | $offset | The character position to start searching from |
The integer character position of the first occurrence of the needle or FALSE if no match
Replaces matching parts of the string, with matches being done in a case-insensitive manner
If $search and $replace are both arrays and $replace is shorter, the extra $search strings will be replaced with an empty string. If $search is an array and $replace is a string, all $search values will be replaced with the string specified.
string ireplace( string $string, mixed $search, mixed $replace )
string | $string | The string to perform the replacements on |
mixed | $search | The string (or array of strings) to search for - see method description for details |
mixed | $replace | The string (or array of strings) to replace with - see method description for details |
The input string with the specified replacements
Finds the last position (in characters) of the search value in the string - case is ignored when performing a match
mixed irpos( string $haystack, string $needle, integer $offset=0 )
string | $haystack | The string to search in |
string | $needle | The string to search for. This match will be done in a case-insensitive manner. |
integer | $offset | The character position to start searching from. A negative value will stop looking that many characters from the end of the string |
The integer character position of the last occurrence of the needle or FALSE if no match
Matches a string needle in the string haystack, returning a substring from the beginning of the needle to the end of the haystack
Can optionally return the part of the haystack before the needle. Matching is done in a case-insensitive manner.
mixed istr( string $haystack, string $needle, boolean $before_needle=FALSE )
string | $haystack | The string to search in |
string | $needle | The string to search for. This match will be done in a case-insensitive manner. |
boolean | $before_needle | If a substring of the haystack before the needle should be returned instead of the substring from the needle to the end of the haystack |
The specified part of the haystack, or FALSE if the needle was not found
Determines the length (in characters) of a string
integer len( string $string )
string | $string | The string to measure |
The number of characters in the string
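The difference between counting characters and counting bytes is the reason this method exists. A small sketch, where `sketch_len()` is a hypothetical helper that counts characters with a PCRE pattern rather than the Flourish code:

```php
<?php
// Hypothetical helper: counts UTF-8 characters instead of bytes.
// Assumes the input is valid UTF-8.
function sketch_len(string $string): int
{
    // With the u modifier, "." matches one whole character;
    // the s modifier lets it match newlines as well.
    return preg_match_all('/./us', $string);
}

$string = 'héllo wörld';
echo strlen($string), "\n";     // 13 (bytes: é and ö take two bytes each)
echo sketch_len($string), "\n"; // 11 (characters)
```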
Converts all uppercase characters to lowercase
string lower( string $string )
string | $string | The string to convert |
The input string with all uppercase characters in lowercase
Trims whitespace, or any specified characters, from the beginning of a string
string ltrim( string $string, string $charlist=NULL )
string | $string | The string to trim |
string | $charlist | The characters to trim |
The trimmed string
Compares strings using a natural order algorithm, with the resulting order having latin characters that are based on ASCII letters placed after the relative ASCII characters
Please note that this function sorts based on English language sorting rules only. Locale-specific sorting is done by strcoll(); however, it has technical limitations.
integer natcmp( string $str1, string $str2 )
string | $str1 | The first string to compare |
string | $str2 | The second string to compare |
< 0 if $str1 < $str2, 0 if they are equal, > 0 if $str1 > $str2
Converts a UTF-8 character to a unicode code point
string ord( string $character )
string | $character | The character to decode |
The U+hex unicode code point for the character
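Decoding a single UTF-8 character back into a code point reverses the byte layout of the encoding. The following is a sketch only: `sketch_ord()` is a hypothetical name, it assumes one valid, non-empty UTF-8 character as input, and the four-digit zero padding of the result is an assumption about the output format.

```php
<?php
// Hypothetical helper: decodes the first UTF-8 character of the input.
// Assumes valid, non-empty input; no error handling is attempted.
function sketch_ord(string $character): string
{
    // unpack() returns a 1-indexed array; reindex it from 0.
    $bytes = array_values(unpack('C*', $character));
    $first = $bytes[0];

    if ($first < 0x80) {            // 1 byte
        $cp = $first;
    } elseif ($first < 0xE0) {      // 2 bytes
        $cp = (($first & 0x1F) << 6) | ($bytes[1] & 0x3F);
    } elseif ($first < 0xF0) {      // 3 bytes
        $cp = (($first & 0x0F) << 12)
            | (($bytes[1] & 0x3F) << 6)
            | ($bytes[2] & 0x3F);
    } else {                        // 4 bytes
        $cp = (($first & 0x07) << 18)
            | (($bytes[1] & 0x3F) << 12)
            | (($bytes[2] & 0x3F) << 6)
            | ($bytes[3] & 0x3F);
    }

    // Assumed format: 'U+' followed by at least four uppercase hex digits.
    return sprintf('U+%04X', $cp);
}

echo sketch_ord('é'), "\n"; // U+00E9
echo sketch_ord('€'), "\n"; // U+20AC
```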
Pads a string to the number of characters specified
string pad( string $string, integer $pad_length, string $pad_string=' ', string $pad_type='right' )
string | $string | The string to pad |
integer | $pad_length | The character length to pad the string to |
string | $pad_string | The string to pad the source string with |
string | $pad_type | The type of padding to do: 'left', 'right', 'both' |
The input string padded to the specified character width
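Padding by characters rather than bytes could be sketched as below. `sketch_pad()` is hypothetical and not taken from Flourish; in particular, placing the extra character on the right for 'both' is an assumption borrowed from PHP's str_pad(). Requires PHP 7+ for intdiv().

```php
<?php
// Hypothetical helper: pads to a character width, not a byte width.
function sketch_pad(string $string, int $pad_length, string $pad_string = ' ', string $pad_type = 'right'): string
{
    // Counts characters (not bytes) in valid UTF-8 input.
    $count = function (string $s): int {
        return preg_match_all('/./us', $s);
    };

    $missing = $pad_length - $count($string);
    if ($missing <= 0 || $pad_string === '') {
        return $string;
    }

    // Builds exactly $n characters of padding, cutting between characters.
    $fill = function (int $n) use ($pad_string, $count): string {
        if ($n <= 0) {
            return '';
        }
        $repeated = str_repeat($pad_string, (int) ceil($n / $count($pad_string)));
        $chars = preg_split('//u', $repeated, -1, PREG_SPLIT_NO_EMPTY);
        return implode('', array_slice($chars, 0, $n));
    };

    if ($pad_type === 'left') {
        return $fill($missing) . $string;
    }
    if ($pad_type === 'both') {
        $left = intdiv($missing, 2); // assumption: odd remainder goes right
        return $fill($left) . $string . $fill($missing - $left);
    }
    return $string . $fill($missing);
}

echo sketch_pad('é', 4, '-', 'left'), "\n"; // ---é
```

The byte-based str_pad() would treat 'é' as two units wide and add one pad character too few.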
Finds the first position (in characters) of the search value in the string
mixed pos( string $haystack, string $needle, integer $offset=0 )
string | $haystack | The string to search in |
string | $needle | The string to search for |
integer | $offset | The character position to start searching from |
The integer character position of the first occurrence of the needle or FALSE if no match
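Because both the offset and the result are measured in characters, they can differ from what the byte-based strpos() reports. A rough sketch of the idea, with `sketch_pos()` as a hypothetical helper that only handles non-negative offsets and non-empty needles:

```php
<?php
// Hypothetical helper: character-based strpos() for valid UTF-8 input.
function sketch_pos(string $haystack, string $needle, int $offset = 0)
{
    // Convert the character offset into a byte offset.
    $chars = preg_split('//u', $haystack, -1, PREG_SPLIT_NO_EMPTY);
    $byte_offset = strlen(implode('', array_slice($chars, 0, $offset)));

    $byte_pos = strpos($haystack, $needle, $byte_offset);
    if ($byte_pos === false) {
        return false;
    }

    // Convert the byte position back into a character position by
    // counting the characters that precede the match.
    return preg_match_all('/./us', substr($haystack, 0, $byte_pos));
}

$haystack = 'héllo wörld';
echo strpos($haystack, 'w'), "\n";     // 7 (bytes)
echo sketch_pos($haystack, 'w'), "\n"; // 6 (characters)
```

As with strpos(), a result of 0 is a valid position, so the return value should be compared against FALSE with `===`.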
Replaces matching parts of the string
If $search and $replace are both arrays and $replace is shorter, the extra $search strings will be replaced with an empty string. If $search is an array and $replace is a string, all $search values will be replaced with the string specified.
string replace( string $string, mixed $search, mixed $replace )
string | $string | The string to perform the replacements on |
mixed | $search | The string (or array of strings) to search for - see method description for details |
mixed | $replace | The string (or array of strings) to replace with - see method description for details |
The input string with the specified replacements
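The array rules described above match those PHP documents for its built-in str_replace(), so the built-in can be used to illustrate them. This sketch does not call the Flourish method, and the sample strings are made up:

```php
<?php
$string = 'Ünïcödé';

// $replace is shorter than $search: 'ö' and 'é' have no counterpart,
// so they are replaced with an empty string.
$shorter = str_replace(['Ü', 'ï', 'ö', 'é'], ['U', 'i'], $string);
echo $shorter, "\n"; // Unicd

// $search is an array and $replace is a string: every search value
// is replaced with the same string.
$single = str_replace(['ï', 'ö'], '_', $string);
echo $single, "\n"; // Ün_c_dé
```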
Please note: this method is public, however it is primarily intended for internal use by Flourish and will normally not be useful in site/application code
Resets the configuration of the class
void reset( )
Reverses a string
string rev( string $string )
string | $string | The string to reverse |
The reversed string
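Reversing byte by byte, as strrev() does, scrambles multi-byte characters. A character-aware reversal could be sketched as follows, with `sketch_rev()` as a hypothetical helper that assumes valid UTF-8 input:

```php
<?php
// Hypothetical helper: reverses the order of characters, not bytes.
function sketch_rev(string $string): string
{
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    return implode('', array_reverse($chars));
}

echo sketch_rev('héllo'), "\n"; // olléh
// strrev('héllo') would reverse the two bytes of 'é' and yield invalid UTF-8.
```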
Finds the last position (in characters) of the search value in the string
mixed rpos( string $haystack, string $needle, integer $offset=0 )
string | $haystack | The string to search in |
string | $needle | The string to search for. |
integer | $offset | The character position to start searching from. A negative value will stop looking that many characters from the end of the string |
The integer character position of the last occurrence of the needle or FALSE if no match
Trims whitespace, or any specified characters, from the end of a string
string rtrim( string $string, string $charlist=NULL )
string | $string | The string to trim |
string | $charlist | The characters to trim |
The trimmed string
Matches a string needle in the string haystack, returning a substring from the beginning of the needle to the end of the haystack
Can optionally return the part of the haystack before the needle.
mixed str( string $haystack, string $needle, boolean $before_needle=FALSE )
string | $haystack | The string to search in |
string | $needle | The string to search for |
boolean | $before_needle | If a substring of the haystack before the needle should be returned instead of the substring from the needle to the end of the haystack |
The specified part of the haystack, or FALSE if the needle was not found
Extracts part of a string
mixed sub( string $string, integer $start, integer $length=NULL )
string | $string | The string to extract from |
integer | $start | The zero-based starting index to extract from. Negative values will start the extraction that many characters from the end of the string. |
integer | $length | The length of the string to extract. If an empty value is provided, the remainder of the string will be returned. |
The extracted substring or FALSE if the start is out of bounds
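Character-based extraction with negative start values might be sketched like this. `sketch_sub()` is hypothetical, requires PHP 7.1+, and its exact out-of-bounds rule (a start beyond either end of the string returns FALSE) is an assumption rather than confirmed Flourish behaviour.

```php
<?php
// Hypothetical helper: substr() measured in characters for valid UTF-8.
function sketch_sub(string $string, int $start, ?int $length = null)
{
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    $total = count($chars);

    // Assumed bounds rule: reject starts past either end of the string.
    if ($start > $total || $start < -$total) {
        return false;
    }

    // array_slice() accepts negative offsets and a NULL length, which
    // line up with the parameter descriptions above.
    return implode('', array_slice($chars, $start, $length));
}

$string = 'héllo wörld';
echo sketch_sub($string, 1, 4), "\n"; // éllo
echo sketch_sub($string, -5), "\n";   // wörld
```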
Trims whitespace, or any specified characters, from the beginning and end of a string
string trim( string $string, string $charlist=NULL )
string | $string | The string to trim |
string | $charlist | The characters to trim, .. indicates a range |
The trimmed string
Converts the first character of the string to uppercase.
string ucfirst( string $string )
string | $string | The string to process |
The processed string
Converts the first character of every word to uppercase
Words are considered to start at the beginning of the string, or after any whitespace character.
string ucwords( string $string )
string | $string | The string to process |
The processed string
Converts all lowercase characters to uppercase
string upper( string $string )
string | $string | The string to convert |
The input string with all lowercase characters in uppercase
Wraps a string to a specific character width
string wordwrap( string $string, integer $width=75, string $break="\n", boolean $cut=FALSE )
string | $string | The string to wrap |
integer | $width | The character width to wrap to |
string | $break | The string to insert as a break |
boolean | $cut | If words longer than the character width should be split to fit |
The input string wrapped to the specified character width